ABSTRACT


Title  of  Thesis:  The  Impact  of  Sociodemographic  Factors  on  Racial/Ethnic
Differences  in  Tumor  Stage  and  Tumor  Size  for  Cancer  of  the
Female  Breast

Name,  degree,  year:  Barry  A.  Miller,  Doctor  of  Public  Health,  2000 

Thesis  directed  by:  Terry  L.  Thomas,  Ph.D.,  Associate  Professor,  Department  of
Preventive  Medicine  and  Biometrics

A  population-based,  case-control  study  was  conducted  to  determine  the 
importance  of  sociodemographic  factors  in  explaining  racial/ethnic  differences  in  tumor 
stage  and  size  at  the  time  of  diagnosis  among  women  with  invasive,  primary  breast 
cancer.  The  study  group  included  106,607  women  newly  diagnosed  with  breast  cancer 
during  the  years  1992  through  1996  while  residing  in  any  of  the  eleven  reporting  areas  in 
the  United  States  that  comprise  the  Surveillance,  Epidemiology,  and  End  Results  (SEER) 
program  of  the  National  Cancer  Institute  (NCI). 

Descriptive  tabulations  of  the  study  variables  indicated  that  Japanese  and  White 
women  tended  to  be  diagnosed  at  an  earlier  stage,  with  smaller  diameter  tumors,  and  at  a 
lower  tumor  grade  than  other  groups.  Black  and  Hispanic  women  were  more  likely  than 
other  groups  to  be  diagnosed  with  metastatic  disease,  with  tumors  2  cm  or  larger  in 
diameter,  and  with  poorly  differentiated  tumors.  In  the  regression  analysis,  elevated  odds 
ratios  among  Black  and  Hispanic  patients  for  later  stage  and  larger  size  tumors  were 
reduced  by  50%  to  60%  when  sociodemographic  factors  were  added  to  a  model  already 
containing  age  and  geographic  area.  Tumor  grade  and  hormone  receptor  status  only 
explained  a  small  amount  of  the  excess  odds  for  distant  stage  disease  among  Black  and 
Hispanic  women,  and  did  not  explain  any  of  the  racial/ethnic  differences  in  regional
stage  disease  or  larger  tumor  size.  In  the  analysis  of  tumor  size,  odds  ratios  for  Black,
Hispanic,  Filipino,  Chinese,  and  Korean  women  remained  elevated  relative  to  White 
women  after  adjustment  for  sociodemographic  factors,  tumor  grade,  and  hormone 
receptor  status.  Japanese  women,  conversely,  had  consistently  lower  odds  ratios 
(relative  to  White  women)  for  every  study  outcome. 

Results  from  this  study  suggest  that  sociodemographic  factors  account  for  a 
significant  portion  of  the  observed  racial/ethnic  differences  in  the  stage  of  disease  and 
tumor  size  at  the  time  of  diagnosis,  but  that  unmeasured  differences  in  socioeconomic  or 
biological  characteristics  of  breast  tumors  among  some  racial/ethnic  groups  may  also 
exist.  The  special  cancer  data  base  created  for  this  study  may  now  be  used  to  investigate 
the  importance  of  sociodemographic  factors  in  explaining  population  patterns  for  other 
types  of  cancer. 

CHAPTER  I.  BACKGROUND  AND  LITERATURE  REVIEW


A.  Background 

1.  Characterization  of  Breast  Tumors 

Breast  cancer  is  the  most  common  form  of  cancer  diagnosed  among  women  in  the 
United  States,  accounting  for  about  29%  of  all  malignancies  [ACS  1999].  It  is  also  the 
most  common  cancer  in  women  worldwide  [Parkin  1998].  About  16%  of  all  cancer 
deaths  among  U.S.  women  are  due  to  cancer  of  the  breast,  placing  it  second  to  cancer  of 
the  lung  and  bronchus  [ACS  1999]. 

Over  90%  of  breast  carcinomas  arise  as  a  neoplasm  of  the  ductal  epithelium,  with 
the  remainder  developing  as  lower  grade  neoplasms  from  the  lobular  epithelium 
[Henderson  1996].  About  15%  to  20%  of  breast  cancers  are  diagnosed  very  early  in  their 
natural  history  and  may  be  termed  carcinoma  in  situ  [PDQ  1999].  They  have  all  of  the 
characteristics  of  malignancy  except  invasion.  An  in  situ  cancer  has  not  penetrated  the 
basement  membrane  nor  extended  beyond  the  epithelial  tissue.  Some  common  synonyms 
are  intraepithelial  (confined  to  the  epithelial  tissue),  non-invasive,  and  non-infiltrating. 
Once  a  cancer  has  invaded  other  tissues  or  spread  to  other  parts  of  the  body,  it  is  termed 
invasive.  Several  histologic  types  of  invasive  breast  cancer  have  been  identified,  but 
ductal  carcinoma,  not  otherwise  specified,  is  by  far  the  most  commonly  recorded  type.  It 
comprises  about  80%  of  all  cases  [Berg  1995].  A  few  specific  variants  of  invasive  ductal 
carcinoma  have  a  better  prognosis  than  other  types.  They  include  pure  mucinous,  pure 
tubular,  pure  medullary  and  pure  papillary  carcinoma  [Fisher  1993,  Donegan  1997]. 
These  special  types  form  a  small  group,  however,  representing  less  than  6%  of  all
invasive  breast  cancers  [Berg  1995]. 

The  anatomic  extent  of  a  cancer,  determined  clinically  or  pathologically,  is  a 
classic  and  reliable  indicator  of  prognosis  [Simpson  1996,  Donegan  1997].  The  main 
components  used  in  classifying  the  extent  of  disease  are  size  of  the  tumor,  extension  of 
the  tumor,  evidence  of  metastasis,  and  lymph  node  involvement.  General  staging 
categories  for  invasive  breast  cancer  include  localized  (confined  to  the  breast  tissue  with 
no  lymph  node  involvement),  regional  (direct  invasion  to  extramammary  tissues  and/or 
metastasis  to  regional  lymph  nodes),  and  distant  (metastasis  beyond  regional  tissues) 
[Seiffert  1993].  These  categories  identify  three  general  groups  with  distinctly  different 
probabilities  for  survival  after  diagnosis  and  treatment.  Five-year  cumulative  relative 
survival  rates  associated  with  this  staging  scheme  for  patients  diagnosed  between  1988
and  1994  through  the  Surveillance,  Epidemiology  and  End  Results  (SEER)  population-based
registry  system  are  shown  in  Table  1-1.  This  general  staging  scheme  is  useful  for 
monitoring  time  trends  in  cancer  rates  for  surveillance  purposes  since  the  stage 
definitions  remain  comparable  over  time.  It  differs  from  the  more-detailed  clinical 
staging  scheme  developed  by  the  American  Joint  Committee  on  Cancer  (AJCC),  which 
makes  use  of  tumor  size  in  assigning  the  stage,  but  has  changed  its  staging  definitions 
over  time.  The  AJCC  staging  is  based  on  the  size  of  the  primary  tumor  (T),  the  absence 
or  presence  and  extent  of  regional  lymph  node  metastasis  (N),  and  the  absence  or 
presence  of  distant  metastasis  (M).  This  scheme  is  referred  to  as  the  TNM  system  and  is 
delineated  in  a  manual  published  by  the  AJCC  [AJCC  1997].  TNM-based  stage 
groupings  for  invasive  breast  cancers  are  summarized  in  Table  1-2.  A  limitation  of  all 
cancer  staging  is  that  it  provides  a  static  picture  of  the  disease.  Within  each  stage  are
cases  with  differing  biological  potential  and  speed  of  progression  [Donegan  1997].
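
As an illustration only, the general staging categories described above (localized, regional, distant) can be written as a simple decision rule. The Python sketch below is a simplification under stated assumptions: the class and field names are hypothetical and are not taken from SEER or AJCC coding manuals, which record extent of disease in far more detail.

```python
from dataclasses import dataclass


@dataclass
class ExtentOfDisease:
    """Hypothetical, simplified summary of disease extent at diagnosis."""
    extramammary_invasion: bool     # direct invasion of tissues outside the breast
    regional_nodes_involved: bool   # metastasis to regional lymph nodes
    distant_metastasis: bool        # metastasis beyond regional tissues


def general_stage(extent: ExtentOfDisease) -> str:
    """Assign the general stage category using the definitions in the text."""
    if extent.distant_metastasis:
        return "distant"
    if extent.extramammary_invasion or extent.regional_nodes_involved:
        return "regional"
    # Confined to the breast tissue with no lymph node involvement.
    return "localized"
```

For example, `general_stage(ExtentOfDisease(False, True, False))` returns `"regional"`. Tumor size does not enter this rule, in keeping with the distinction drawn above between the general scheme and the TNM system.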

Tumor  size,  measured  as  the  largest  dimension  or  diameter  of  the  primary  tumor, 
is  second  only  to  axillary  lymph  node  status  as  an  independent  prognostic  factor 
[Donegan  1997].  It  is  directly  related  to  an  increasing  probability  of  regional  metastasis, 
an  increasing  average  number  of  involved  axillary  lymph  nodes,  and  an  increasing 
probability  of  recurrence  and  death.  Studies  indicate  that  tumors  of  equal  size  are 
prognostically  similar  whether  they  are  palpable  or  not  and  independent  of  their  method 
of  detection  [Tabar  1987,  Pagana  1989].  Tumors  1.0  cm  or  less  in  diameter  have  an 
especially  low  risk  of  recurrence.  Several  studies  have  reported  5-year  or  10-year 
disease-free  survival  exceeding  90  percent  for  node-negative  patients  with  tumors  1.0  cm 
or  less  in  diameter  [O’Reilly  1990,  Merkel  1993,  Rosen  1993]. 

Another  feature  of  invasive  ductal  and  lobular  breast  carcinomas  that  has 
prognostic  value  is  histologic  grade.  Histologic  grade  is  a  measure  of  intrinsic  malignant 
characteristics  of  the  tumor  including  the  degree  of  tubule  formation,  number  of  mitoses, 
and  nuclear  pleomorphism  in  routine  sections  of  breast  tissue  [Donegan  1997].  This 
information  is  used  to  assign  a  grade  indicating  the  degree  of  tumor  differentiation 
ranging  from  well  differentiated  (low  grade),  through  moderately  differentiated,  to  poorly 
differentiated  (high  grade).  The  degree  of  differentiation,  in  turn,  is  a  morphologic 
indicator  of  tumor  aggressiveness,  with  highly  differentiated  (i.e.,  low  grade)  tumors 
being  less  aggressive  [Donegan  1997].  Histologic  grade  correlates  with  breast  cancer 
patient  survival,  with  high  grade  cancers  having  the  lowest  survival  probabilities  [Henson 
1991].  This  relationship  persists  in  spite  of  interobserver  and  intraobserver  variation 
among  pathologists  grading  breast  cancer  [Henson  1991],  and  even  after  the  lymph  node
status  of  patients  is  taken  into  account  [Fisher  1993,  Garne  1994].

A  variety  of  proteins  involved  in  cellular  differentiation,  proliferation,  and 
invasion  are  differentially  expressed  in  neoplastic  and  normal  breast  epithelium.  The 
most  widely  recognized  among  these  are  the  estrogen  (ER)  and  progesterone  (PR) 
hormone  receptors.  Levels  of  ER  and  PR  proteins  in  breast  tumor  tissue  have  undergone 
intensive  study  both  as  indicators  of  prognosis  and  as  predictors  of  response  to  hormone 
and  endocrine  therapy  [Donegan  1997,  Osborne  1998].  These  receptors  are  polypeptides 
that  bind  their  respective  hormones,  translocate  to  the  nucleus,  and  induce  specific  gene 
expression  [ASCO  1996].  PR  is  expressed  only  after  transcriptional  activation  of  its  gene 
by  a  functional  ER-estrogen  complex.  ER  positive  or  PR  positive  tumors  are  correlated 
with  favorable  prognostic  features  including  evidence  of  tumor  cell  differentiation  (i.e., 
low-grade  histology)  and  a  lower  rate  of  cell  proliferation  [Mohla  1982,  Pegoraro  1986, 
Dhingra  1996,  Osborne  1998].  Tumors  that  are  positive  for  ER  generally  have  a  low
S-phase  fraction,  indicating  that  a  low  percentage  of  tumor  cells  are  in  the  proliferation
phases  of  the  cell  cycle  [ASCO  1996,  Donegan  1997,  Beckmann  1997,  Landberg  1997, 
Osborne  1998,  Ravaioli  1998].  ER  and  PR  levels  have  been  widely  used  by  oncologists 
to  predict  the  likelihood  of  recurrent  disease,  although  data  supporting  this  use  are 
inconsistent  [ASCO  1996,  Ferno  1998].  A  recent  review  of  published  studies  indicates
that  ER  status  and  PR  status  are  probably  more  reflective  of  tumor  growth  rate  than  of 
metastatic  potential  [Donegan  1997].  The  measurement  of  ER  and  PR  is  most  useful  for 
predicting  response  to  hormonal  therapy  [ASCO  1996,  Dhingra  1996,  Osborne  1998].
Tumors  that  express  both  ER  and  PR  have  the  greatest  benefit  from  hormonal  therapy,
but  those  containing  only  ER  or  only  PR  still  have  significant  responses.

Breast  cancer  is  highly  treatable  by  surgery,  radiation  therapy,  chemotherapy,  and 
hormonal  therapy.  Selection  of  therapy  is  influenced  by  the  tumor  stage;  pathologic 
characteristics  of  the  primary  tumor,  including  ER  and  PR  levels  and  lymph  node 
involvement;  menopausal  status;  patient  age;  and  general  health  [PDQ  1999].  A 
summary  of  current  treatment  options  by  tumor  stage,  based  on  information  from  the 
National  Cancer  Institute’s  comprehensive  cancer  database  [PDQ  1999],  appears  in 
Table  1-3a-d.


2.  Breast  Cancer  Etiology 

Considerable  experimental,  clinical,  and  epidemiologic  research  aimed  at 
clarifying  the  etiology  of  breast  cancer  indicates  that  hormones  play  a  major  role  [Kelsey 
1990,  Le  Marchand  1991,  Habel  1993,  Henderson  1996,  Beckmann  1997].  The  known
risk  factors  can  be  thought  of  in  terms  of  their  influence  on  cumulative  exposure  of  breast 
tissue  to  estrogen  and  perhaps  progesterone  [Pike  1993,  Henderson  1996].  Endogenous 
and  exogenous  hormones  appear  to  affect  the  expression  of  oncogenes  and
tumor-suppressor  genes  directly  by  altering  promoter  activity  and  indirectly  by  influencing  the
proliferation  rate  of  breast  epithelial  cells  [Beckmann  1997].  The  activation  of 
oncogenes  and  inactivation  of  tumor-suppressor  genes  produces  a  series  of  genetic 
changes  that  are  believed  to  lead  to  malignancy  [Pike  1993,  Henderson  1996,  Landberg 
1997]. 


The  most  established  risk  factors  for  breast  cancer  include  family  history,
particularly  among  first-degree  relatives;  early  menarche;  and  late  ages  at  first  childbirth
and  menopause.  These  factors,  however,  are  not  readily  modifiable  for  the  purpose  of 
disease  prevention.  There  is  evidence  that  menopausal  estrogen  replacement  therapy 
increases  breast  cancer  risk,  but  only  to  a  small  extent  [Brinton  1993,  Pike  1993, 
Henderson  1996,  Colditz  1998].  The  potential  effect  of  oral  contraceptives  on  risk  is 
complex  and  seems  to  be  limited  to  a  subgroup  of  recent  long-term  users,  though  a 
confounding  effect  of  increased  medical  surveillance  in  this  group  cannot  be  ruled  out
[Malone  1993,  Collaborative  Group  1996].  The  question  of  whether  dietary  intake  of  fat 
plays  a  role  in  the  development  of  breast  cancer  has  been  the  focus  of  many  ecological, 
migrant,  prospective  cohort,  case-control  and  experimental  studies  [Greenwald  1999, 
Hunter  1999].  This  factor  would  be  more  amenable  to  change,  but  the  analytic 
epidemiological  studies  generally  do  not  support  an  association  [Holmes  1999].  Obesity 
in  post-menopausal  women  has  been  linked  with  mortality  from  breast  cancer  due,  in 
part,  to  delayed  diagnosis  [Mohle-Boetani  1988,  Hunter  1993,  Hulka  1994,  Yong  1996, 
Jones  1997]  and  to  a  worse  prognosis  that  is  independent  of  the  stage  of  disease  [Tretli
1990,  Senie  1992].  Available  data,  however,  suggest  that  obesity  can  account  for  only 
weak  or  moderate  elevations  in  risk  [Le  Marchand  1991,  Harris  1992,  Henderson  1996]. 
Numerous  studies  have  linked  moderate  to  heavy  alcohol  intake  with  increases  in  breast 
cancer  risk  [Rosenberg  1993,  Longnecker  1995,  Henderson  1996],  but  the  proportion  of 
breast  cancer  cases  attributable  to  alcohol  consumption  in  the  United  States,  assuming 
causality,  is  estimated  to  be  quite  small  [Longnecker  1999].  Findings  for  light  to 
moderate  alcohol  consumption  are  inconsistent  and  positive  studies  indicate  only  a  very 
slight  increase  in  risk  [Longnecker  1995,  Zhang  1999a,  Zhang  1999b].  Several  recent 
epidemiologic  studies  have  suggested  that  physical  activity  is  related  to  a  reduced  risk  for
breast  cancer,  but  the  magnitude  of  the  effect  is  unclear,  the  underlying  biologic 
mechanisms  remain  unexplained,  and  confounding  and  effect  modification  by  other 
factors  cannot  be  ruled  out  [Brinton  1998,  Friedenreich  1998].

In  summary,  since  we  do  not  know  how  to  effectively  prevent  this  major  cause  of 
female  cancer  mortality,  control  strategies  emphasize  the  early  detection  and  treatment  of 
breast  tumors  before  they  have  reached  an  advanced  stage.  A  high  quality  mammogram 
with  a  clinical  breast  exam  is  the  most  effective  way  to  detect  breast  cancer  early,  when  it 
is  most  treatable  [Senie  1994].  There  is  not  universal  agreement,  however,  on  the  age  at 
which  screening  mammography  should  begin.  The  National  Cancer  Institute  [NCI  1997] 
and  the  American  Cancer  Society  [ACS  1997]  have  accepted  the  March  1997 
recommendations  of  the  National  Cancer  Advisory  Board  stating  (with  one  dissenting 
vote)  that:  1)  Women  aged  40  and  older  should  be  screened  every  one  to  two  years  with
mammography;  and,  2)  Women  who  are  at  higher  than  average  risk  of  breast  cancer 
should  seek  expert  medical  advice  about  whether  they  should  begin  screening  before  age 
40  and  about  the  frequency  of  screening.  Members  of  a  Consensus  Development  Panel 
on  mammography  sponsored  by  the  National  Institutes  of  Health  in  January  1997  [NIH 
1997]  concluded  that  “the  available  data  did  not  warrant  a  single  recommendation  for  all
women  in  their  forties”  and  that  women  in  this  age  group  should  make  their  own  decision 
in  consultation  with  health  professionals.  The  U.S.  Preventive  Services  Task  Force  has 
also  not  recommended  routine  mammograms  for  average-risk  women  in  their  40s 
[USPSTF  1996].  Early  detection  and  treatment  efforts  in  the  United  States  have  achieved 
only  limited  success,  however,  since  recent  mortality  declines  are  modest  and  are  not 
comparable  in  all  segments  of  the  population  [Ries  1998].

B.  Significance  of  this  Study 

Survival  from  breast  cancer  among  women  in  the  United  States  varies  by 
racial/ethnic  group.  These  survival  differences  often  persist  after  stage  of  disease  at  the 
time  of  diagnosis  is  taken  into  account.  Proposed  explanations  for  this  disparity  in 
survival  include  racial/ethnic  differences  in  socioeconomic  position  and/or  differences  in
the  biological  characteristics  of  breast  tumors.  Results  from  the  few  studies  that  have 
examined  these  factors  concurrently  are  inconsistent. 

Although  several  studies  report  that  socioeconomic  factors  explain  a  large  part  of 
the  racial/ethnic  differences  in  breast  cancer  survival,  evidence  for  an  additional  effect 
due  to  racial/ethnic  differences  in  the  biological  characteristics  of  breast  tumors  is 
inconclusive.  The  three  largest  studies  looking  at  racial/ethnic  patterns  of  socioeconomic 
factors  and  tumor  biology  included  only  White  and  Black  women  [Chen  1994,  Gordon 
1995,  Elmore  1998].  Two  additional  studies  included  Hispanic  women  [Weiss  1995]  and 
Asian  women  [Krieger  1997a],  but  their  populations  were  too  small  to  draw  reliable
conclusions.  Two  of  the  three  large  studies  were  hospital-based,  and  one  of  these 
accrued  its  study  group  from  patients  participating  in  clinical  trials.  Since  various 
selection  factors  may  have  influenced  whether  breast  cancer  patients  were  included  in 
these  two  investigations,  their  findings  may  not  be  generalizable. 

Thus,  there  is  a  need  for  population-based  studies  with  larger  study  populations 
and  more  diverse  racial/ethnic  groups.  A  larger  study  size  will  provide  additional  power
for  developing  reliable  estimates  of  the  effect  of  racial/ethnic  group  and  socioeconomic
position  on  breast  cancer  outcomes.  It  will  also  improve  our  ability  to  detect  differences 
in  the  patterns  of  various  tumor  characteristics  (e.g.,  stage,  size,  grade,  hormone  receptor 
status)  across  racial/ethnic  groups  and  socioeconomic  levels. 

C.  Literature  Review 

1.  Survival  Differences  by  Race/Ethnicity 

Survival  rates  among  breast  cancer  patients  are  known  to  vary  by  racial/ethnic 
group.  Data  from  population-based  cancer  registries  in  the  United  States  have 
consistently  reported  that  Black  women  have  poorer  survival  than  Whites  [Axtell  1978, 
NIH  1980,  Le  Marchand  1984,  Young  1984,  Vernon  1985,  Bain  1986,  Baquet  1986, 
Ragland  1991,  Elledge  1994,  Simon  1996,  Meng  1997,  Ries  1998].  Two  studies  of  the 
survival  experience  of  women  from  five  major  racial/ethnic  groups  in  Hawaii  found  that 
native  Hawaiian  and  Filipino  women  had  a  higher  risk  of  dying  within  five  years 
following  a  breast  cancer  diagnosis  than  women  from  other  racial/ethnic  groups  [Le 
Marchand  1984,  Meng  1997].  Japanese  and  Chinese  women  had  the  highest  five-year 
survival  probabilities,  followed  by  Whites  among  patients  diagnosed  between  1960  and 
1979  [Le  Marchand  1984].  Findings  were  similar  for  a  later  series  of  patients  diagnosed 
between  1980  and  1988  [Meng  1997]. 

Differential  proportions  of  more  advanced  disease  at  the  time  of  diagnosis  play  a
role  in  the  racial/ethnic  disparities,  though  survival  differences  often  persist  after  stage  of 
disease  is  taken  into  account  [Le  Marchand  1984,  Vernon  1985,  Le  Marchand  1985,
Bain  1986,  Samet  1987,  Ragland  1991,  Elledge  1994,  Simon  1996,  Meng  1997,  Ries 
1998,  Wojcik  1998].  A  study  of  ten-year  survival  rates  by  race/ethnicity  and  stage 
included  1,983  breast  cancer  patients  treated  at  M.D.  Anderson  Hospital  and  Tumor 
Institute  in  Houston,  Texas  between  1949  and  1968  [Vernon  1985].  Black  women  were
found  to  have  poorer  survival  than  either  White  or  Hispanic  women,  whose  survival 
experience  was  similar.  The  racial/ethnic  differences  in  survival  remained  after  age, 
stage  of  disease  at  diagnosis,  and  delay  in  seeking  treatment  were  taken  into  account.  A 
more  recent,  multi-center,  hospital-based  study  of  breast  cancer  patients  found  that 
overall  five-year  survival  among  Black  and  Hispanic  women  was  significantly  worse 
than  for  Whites  [Elledge  1994].  Minority  women  were  more  likely  to  present  with
clinically  advanced  disease.  Within  stage,  however,  Hispanic  and  White  women  had 
comparable  five-year  survival  rates,  while  the  prognosis  for  Black  women  remained 
worse  than  for  the  other  groups. 

Statistics  reported  by  the  National  Cancer  Institute  for  women  diagnosed  with 
breast  cancer  in  the  population-based  cancer  registries  comprising  the  Surveillance, 
Epidemiology  and  End  Results  Program  (SEER)  indicate  that  Black  women  have  poorer 
five-year  relative  survival  rates  than  White  women  within  every  stage  of  disease  [Ries 
1998].  In  metropolitan  Atlanta,  which  is  one  of  the  SEER  cancer  registration  areas, 
survival  rates  were  compared  among  2,322  White  and  536  Black  female  residents  with  a 
diagnosis  of  primary  breast  cancer  between  January  1978  and  December  1982  and 
followed  through  the  end  of  1983  [Bain  1986].  Black  women  in  this  study  group  were 
more  likely  to  be  diagnosed  at  an  advanced  stage  and  were  less  likely  to  receive  surgical 
treatment.  However,  even  when  the  type  of  surgery  and  stage  of  disease  were  controlled
in  the  analysis,  racial/ethnic  group  remained  as  a  significant  prognostic  indicator  for 
survival.  Similar  studies  conducted  by  the  SEER  registries  in  San  Francisco/Oakland 
[Ragland  1991]  and  metropolitan  Detroit  [Simon  1996]  reported  that  Black  female  breast 
cancer  survival  was  poorer  than  that  of  White  females  at  each  stage  of  disease. 
Racial/ethnic  differences  were  greatest  for  regional  disease. 

A  study  of  Hispanic  and  non-Hispanic  Whites  residing  in  New  Mexico  and 
American  Indians  residing  in  New  Mexico  and  Arizona  compared  survival  rates  in  these 
groups  for  incident  cancer  cases  diagnosed  from  1969  through  1982  [Samet  1987]. 
American  Indians  were  found  to  have  significantly  poorer  one-year  and  five-year 
survival  after  a  breast  cancer  diagnosis  than  non-Hispanic  Whites,  even  after  adjustment 
for  stage  and  treatment.  Hispanic  Whites  initially  showed  lower  survival  than
non-Hispanic  Whites,  but  this  difference  disappeared  after  adjustment  for  stage  and  treatment.

In  the  studies  of  racial/ethnic  differences  in  Hawaii,  cited  earlier,  adjustment  for 
stage  of  disease  at  diagnosis  reduced  breast  cancer  survival  differences  among  Japanese, 
Chinese  and  White  women  to  statistically  non-significant  levels  [Le  Marchand  1984, 
Meng  1997].  Five-year  survival  rates  among  Filipino  and  native  Hawaiian  women 
remained  lower  than  the  other  groups,  but  were  reduced  after  adjustment  for  stage. 
Similar  results  were  found  when  follow-up  for  one  of  the  study  groups  was  extended  to 
ten  years  [Le  Marchand  1985].  Among  cases  diagnosed  with  localized  disease,  Filipino 
women  in  Hawaii  had  nearly  a  three-fold  greater  risk  of  dying  within  five  years,  while 
White  women  and  Hawaiian  women  had  an  almost  two-fold  higher  risk  of  dying  than 
Japanese  women  [Meng  1997].  For  advanced  disease,  defined  as  regional  or  distant
stage,  Hawaiian  women  had  a  two-fold  higher  risk  of  dying  than  Japanese  women.  The
combination  of  stage  at  diagnosis  and  marital  status  explained  about  45%  of  the 
racial/ethnic  differences  in  survival  in  their  study  group  [Meng  1997].  Married  patients 
in  their  study  had  the  longest  survival,  a  finding  that  has  been  previously  reported  among 
other  cancer  patients  [Goodwin  1987].  Goodwin  et  al.  studied  over  27,000  epithelial 
cancers  diagnosed  from  January  1969  through  December  1982  among  residents  of  the 
state  of  New  Mexico  and  found  that  unmarried  persons  were  more  likely  to  be  untreated 
for  their  cancer.  After  adjustment  for  stage  distribution  and  treatment,  unmarried  persons 
still  had  poorer  survival. 

Breast  cancer  survival  was  recently  examined  by  race/ethnicity  and  other  factors 
in  a  review  of  records  maintained  by  the  Department  of  Defense  Central  Tumor  Registry 
[Wojcik  1998].  The  study  group  included  698  Black  women  and  6,577  White  women 
diagnosed  with  breast  cancer  between  1975  and  1994  and  treated  in  the  U.S.  military 
equal-access  medical  care  system.  After  adjustment  for  age  at  diagnosis,  the  risk  of 
death  was  1.45  times  greater  for  Black  women  than  for  White  women.  The  risk  only 
declined  to  1.41  after  adjustment  for  stage  of  disease  at  diagnosis  and  remained 
statistically  significant.  Additional  covariates  included  waiting  time  between  diagnosis 
and  first  treatment,  marital  status,  alcohol  usage,  tobacco  usage,  and  family  history  of 
cancer;  but  further  adjustment  for  these  factors  had  no  effect  on  their  findings.  The 
authors  concluded  that  potential  differences  in  tumor  biology,  socioeconomic  status,  or 
sociocultural  factors  may  be  contributing  to  the  survival  differences  they  noted. 
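
One common way to summarize how much of an elevated relative risk is accounted for by an added covariate is to compare the excess risk (the estimate minus 1) before and after adjustment. The Python sketch below applies that convention to the figures quoted above (1.45 after adjustment for age, 1.41 after further adjustment for stage). The function name is hypothetical, and the measure is an assumption made for illustration, not a calculation reported by the study authors.

```python
def percent_excess_explained(rr_before: float, rr_after: float) -> float:
    """Percent of the excess relative risk removed by additional adjustment.

    Excess risk is (RR - 1); the result is the share of that excess which
    disappears when moving from rr_before to rr_after.
    """
    if rr_before <= 1.0:
        raise ValueError("rr_before must exceed 1 for an excess risk to exist")
    return 100.0 * (rr_before - rr_after) / (rr_before - 1.0)


# Figures quoted in the text: 1.45 before and 1.41 after adjustment for stage.
print(round(percent_excess_explained(1.45, 1.41), 1))  # prints 8.9
```

On this measure, adjustment for stage removes only about 9% of the excess risk, which is consistent with the observation that the estimate "only declined" after stage was taken into account.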

2.  Survival  Differences  by  Socioeconomic  Position

In  several  studies,  socioeconomic  position  has  been  found  to  partially  or  entirely 
explain  racial/ethnic  survival  differences  after  other  prognostic  factors,  such  as  stage  of 
disease  and  age  at  diagnosis,  have  been  considered  [Dayal  1982,  Bassett  1986,  Celia 
1991,  Gordon  1992,  Eley  1994,  Greenwald  1996].  Women  in  lower  socioeconomic 
groups  tend  to  have  poorer  survival  rates.  Since  race/ethnicity  is  usually  confounded 
with  socioeconomic  position  in  analytic  studies,  it  is  important  to  examine  the  influence 
of  both  factors.  Frequently,  race/ethnicity  acts  as  a  surrogate  marker  for  socioeconomic 
position  in  studies  of  risk  factors  [Gordon  1995],  though  its  use  in  this  manner  is 
imprecise  and  potentially  misleading  [Harvard  1996]. 

In  a  study  of  survival  patterns  among  breast  cancer  patients  seen  at  the  Medical 
College  of  Virginia  between  1968  and  1977,  socioeconomic  information  on  the  census 
tract  of  residence  was  available  for  a  subset  of  the  study  group  (117  White  and  206  Black 
women)  [Dayal  1982].  Each  of  the  six  socioeconomic  indicators  used  in  the  study  had  a 
significant  association  with  survival  time.  Age  and  stage  at  diagnosis  did  not  explain 
survival  differences  between  the  two  groups,  but  adjustment  for  socioeconomic  position 
reduced  the  racial/ethnic  disparity  to  a  statistically  non-significant  level.  In  contrast  to 
population-based  studies,  however,  Dayal  et  al.  found  that  Black  women  in  their  study
presented  at  an  earlier  stage  than  White  women.  This  may  be  the  result  of  selection  bias 
with  respect  to  the  types  of  patients  being  treated  at  the  study  hospital. 

In  a  larger  study  using  a  cancer  surveillance  system  covering  13  counties  in 
western  Washington  state,  socioeconomic  data  for  census  block  groups  (subunits  of
census  tracts)  were  used  to  characterize  the  socioeconomic  level  of  women  diagnosed  with
breast  cancer  between  January  1973  and  December  1983  [Bassett  1986].  Survival
patterns  among  251  Black  women  and  1,255  White  women  were  examined  using  a  Cox 
regression  model  to  adjust  for  Black-White  differences  in  age,  broad  categories  of  stage
(metastatic,  non-metastatic),  and  tumor  histology  (ductal,  lobular,  other  type).  Black 
mortality  was  about  1.4  times  that  of  Whites  after  adjustment  for  these  factors.
Following  additional  adjustment  for  socioeconomic  level,  Black  mortality  was  only  1.1
times  that  of  Whites  (95%  CI:  0.8,  1.5).  In  both  groups,  lower  socioeconomic  level  was  a
strong  predictor  of  shortened  survival.  The  investigators  suggested  that  studies  of 
Black/White  differences  in  breast  cancer  survival  may  be  incomplete  and  potentially 
misleading  if  they  do  not  jointly  consider  the  role  of  socioeconomic  position. 
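
A reported confidence interval can be read directly for statistical significance, and it can also be used to recover an approximate standard error on the log scale. The Python sketch below does both for the adjusted estimate quoted above (1.1; 95% CI: 0.8, 1.5). It assumes the interval was constructed as a symmetric normal-approximation interval on the log scale, which is conventional for hazard and rate ratios but is not stated in the source; the function names are hypothetical.

```python
import math


def ci_excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True if the interval lies entirely above or entirely below the null value."""
    return not (lower <= null <= upper)


def approx_z(estimate: float, lower: float, upper: float,
             z_crit: float = 1.96) -> float:
    """Approximate z statistic from a ratio estimate and its 95% CI.

    Assumes the CI is exp(log(estimate) +/- z_crit * SE), so that
    SE = (log(upper) - log(lower)) / (2 * z_crit).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    return math.log(estimate) / se


significant = ci_excludes_null(0.8, 1.5)   # False: the interval contains 1.0
z = approx_z(1.1, 0.8, 1.5)                # well below the 1.96 threshold
```

Because the interval spans 1.0, the residual Black-White difference after adjustment for socioeconomic level is not statistically distinguishable from no difference at the 5% level, although the width of the interval also leaves room for a modest remaining effect.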

Another  study  examined  survival  data  on  patients  diagnosed  between  1977  and 
1983  with  one  of  six  types  of  cancer  and  entered  into  the  treatment  protocols  of  a 
cooperative  clinical  trials  group  which  included  institutions  in  the  United  States  and 
Canada  [Cella  1991].  A  strength  of  this  study  was  that  cancer  patients  admitted  to  the
trials  received  the  specified  treatment  regardless  of  income  or  insurance  status. 
Race/ethnicity  (White  vs.  Black)  was  not  a  significant  predictor  of  survival  time  when 
data  were  adjusted  for  differences  in  general  health  status  at  entry,  age,  and  protocol- 
specific  prognostic  factors  (estrogen  receptor  status  for  breast  cancer  patients).  Income 
and  education,  however,  were  important  factors.  Patients  with  lower  annual  incomes  and 
those  with  lower  educational  level  experienced  significantly  shorter  survival  times  than 
those  with  higher  income  or  education. 

In  a  larger  multi-center  clinical  trial  of  stage  I  and  stage  II  breast  cancer  patients 
based at Case Western Reserve University, those having less education and lower
incomes  were  also  found  to  have  poorer  disease-free  survival  and  overall  survival 
[Gordon  1992].  These  differences  remained  after  adjustment  for  estrogen  receptor  status, 
number  of  positive  lymph  nodes,  and  tumor  size.  Racial/ethnic  group  (White,  Black)  was 
not  a  significant  determinant  of  survival  once  adjustment  was  made  for  socioeconomic 
status. 

Another  hospital-based  study  evaluated  the  importance  of  socioeconomic  status 
and  race/ethnicity  in  cancer  survival  by  pooling  information  from  22  Comprehensive 
Cancer  Centers  in  the  United  States  [Greenwald  1996].  This  data  base,  called  the 
Centralized  Cancer  Patient  Data  System,  included  6,896  breast  cancer  cases  diagnosed 
between  July  1977  and  October  1981.  Results  from  a  Cox  proportional  hazards  model 
which  included  age  at  diagnosis,  racial/ethnic  group  (Black,  White),  and  socioeconomic 
status  (percentage  of  high  school  graduates  in  the  postal  code  areas  where  patients 
resided),  indicated  that  socioeconomic  status  and  race/ethnicity  were  independent 
predictors  of  survival.  A  significant  weakness  of  this  study,  however,  was  the  lack  of 
information  on  tumor  characteristics  or  stage  of  disease. 

Noting  the  well-documented  disparity  in  cancer  survival  between  Blacks  and 
Whites,  the  National  Cancer  Institute,  in  1983,  planned  and  funded  the  Black/White 
Cancer  Survival  Study.  Breast  cancer  survival  differences  were  examined  among  612 
Black  and  518  White  women  diagnosed  in  1985  and  1986  in  one  of  three  population- 
based  registry  systems  in  Atlanta,  GA,  New  Orleans,  LA  and  San  Francisco/Oakland,  CA 
[Eley 1994]. Multivariable modeling using Cox proportional hazards regression was
used to estimate the hazard ratio for Blacks compared to Whites, adjusting for stage of
disease, tumor characteristics (positive lymph nodes, histologic subtype, pathological
grade,  estrogen  receptor  status),  treatment  type,  comorbid  conditions,  and 
sociodemographic  factors  (e.g.,  marital  status,  occupation,  usual  source  of  health  care, 
health  insurance  status,  an  index  of  poverty).  After  controlling  for  geographic  area  of 
residence and age in the analysis, the risk of dying was 2.2 times (95% CI: 1.8, 2.8)
greater for Blacks than Whites. Adjustment for stage of disease reduced the risk to 1.7
(95% CI: 1.4, 2.2) and further adjustment for tumor pathology, treatment, comorbidities,
and sociodemographic variables resulted in a hazard ratio comparing Blacks to Whites of
1.3 (95% CI: 1.0, 1.8). Their results were similar, whether analyzing all-cause mortality
or  breast  cancer-specific  mortality.  The  authors  concluded  that  about  40%  of  the 
racial/ethnic  difference  in  survival  was  explained  by  more  advanced  stage  of  disease 
among  Blacks,  another  15%  by  histologic  and  pathologic  differences,  and  a  further  18% 
by  the  amount  of  comorbid  illness  and  sociodemographic  factors.  They  recommended 
that  future  efforts  to  reduce  racial/ethnic  differences  in  survival  be  aimed  at  early 
recognition  of  disease  by  means  of  community  education,  improved  access  to  primary 
care  and  mammography,  and  increased  compliance  with  screening  recommendations. 
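
One common way to arrive at percentages of this kind is to express each adjustment step as a share of the excess hazard (hazard ratio minus 1) seen in the least-adjusted model. The sketch below applies that convention to the hazard ratios quoted above. Whether this matches the calculation used by the cited authors is an assumption, and the function name is illustrative only.

```python
def share_of_excess_explained(hr_before, hr_after, hr_baseline):
    """Fraction of the baseline excess hazard (HR - 1) removed by one adjustment step."""
    return (hr_before - hr_after) / (hr_baseline - 1.0)

# Hazard ratios (Blacks vs. Whites) as quoted in the text above.
hr_age_area = 2.2  # adjusted for age and geographic area only
hr_stage = 1.7     # additionally adjusted for stage of disease
hr_full = 1.3      # additionally adjusted for pathology, treatment,
                   # comorbidity, and sociodemographic variables

stage_share = share_of_excess_explained(hr_age_area, hr_stage, hr_age_area)
later_share = share_of_excess_explained(hr_stage, hr_full, hr_age_area)

print(f"stage: {stage_share:.0%}; later adjustments: {later_share:.0%}")
```

Under this convention, stage accounts for roughly 40% of the excess and the later adjustments together for about one third, which is in line with the reported 40% and the sum of the reported 15% and 18%.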

3.  Biological  Tumor  Markers,  Race/Ethnicity,  and  Socioeconomic 
Position 

Several  studies  have  documented  the  pattern  of  a  poorer  breast  cancer  clinical 
stage distribution among persons in lower socioeconomic groups [Ownby 1985,
Polednak 1986, Farley 1989, Mandelblatt 1991, Wells 1992, Weiss 1995, Bentley 1998,
Lannin 1998]. Since it is unclear, however, whether socioeconomic factors can entirely
account  for  the  racial/ethnic  differences  in  breast  cancer  survival,  potential  differences  in 
biological  characteristics  of  the  tumors  have  been  studied  by  a  number  of  investigators. 
Biological markers for breast tumors may have prognostic value, providing information
on the expected clinical outcome of the malignancy, and/or predictive value,
indicating those patients likely to benefit from adjuvant systemic chemo- or hormonal
therapy [Von Kleist 1996]. Studies of prognostic factors measure biological
characteristics  inherent  to  the  breast  tumor,  such  as  tumor  cell  proliferation,  tumor 
aggressiveness, and potential for metastasis [Von Kleist 1996, Ferno 1998].

Several  available  tumor  marker  tests  were  recently  evaluated  for  their  utility  in 
the  prevention,  screening,  treatment  and  surveillance  of  breast  cancers  by  a  Tumor 
Marker  Expert  Panel  convened  by  the  American  Society  of  Clinical  Oncology.  The 
Panel  developed  a  set  of  clinical  practice  guidelines,  based  on  their  review  of  the 
published  studies  [ASCO  1996].  They  determined  that  the  receptor  proteins  for  estrogen 
and  progesterone  should  be  measured  on  every  primary  breast  cancer  specimen.  The 
Panel  further  concluded  that  the  data  were  insufficient  to  recommend  routine  use  of  the 
other  markers  they  considered  in  their  review,  namely:  carcinoembryonic  antigen  (CEA), 
cancer  antigen  (CA  15-3),  proliferative  markers  (DNA  index  or  S-phase  fraction),  a 
marker  of  tumor  invasion  (cathepsin-D),  a  proto-oncogene  (HER-2/neu),  and  a  tumor 
suppressor  gene  (p53). 

Racial/ethnic  differences  in  the  distribution  of  estrogen  receptor  (ER)  status 
among  breast  cancer  patients  have  been  documented  in  several  studies  [Mohla  1982, 
Hulka 1984, Ownby 1985, Pegoraro 1986, Beverly 1987, Stanford 1987, Stanford 1989,
Chen 1994, Elledge 1994, Gordon 1995, Gapstur 1996, Elmore 1998]. Most, but not all,
of  these  studies  were  hospital-based  or  included  only  patients  in  clinical  trials,  so  their 
findings  may  not  be  generalizable.  Few  adjusted  for  potential  confounding  variables 
such  as  age  and  socioeconomic  position. 

Information  concerning  the  distribution  of  ER  and  PR  by  race/ethnicity  and 
socioeconomic  position  is  limited.  Findings  from  three  recent  studies  examining  the 
relationship  between  Black/White  differences  in  hormone  receptor  status  and 
socioeconomic  position  are  conflicting  [Chen  1994,  Gordon  1995,  Elmore  1998].  In  the 
cross-sectional  study  by  Chen  et  al.,  data  on  tumor  characteristics  and  socioeconomic 
variables  were  collected  from  medical  records  and  in-person  interviews  with  patients 
diagnosed  in  1985  and  1986.  Study  subjects  (n=506  Black  and  457  White  women)  were 
identified  from  population-based  cancer  registries  in  three  urban  areas  (Atlanta,  New 
Orleans,  San  Francisco-Oakland).  Black  women  in  this  study  were  more  likely  than 
Whites  to  have  tumors  that  were  ER-negative,  poorly  differentiated,  with  increased 
nuclear  atypia,  and  more  necrosis.  With  the  exception  of  ER  status,  these  associations 
remained  statistically  significant  after  controlling  for  age,  geographic  area, 
socioeconomic  status,  body  mass  index,  use  of  alcohol  and  tobacco,  reproductive 
experience,  and  health  care  access  and  utilization.  Since  the  social  and  lifestyle  factors 
of  the  study  group  did  not  entirely  explain  racial/ethnic  differences  in  tumor 
characteristics  associated  with  a  poor  prognosis,  Chen  et  al.  concluded  that  biological 
reasons  for  the  racial/ethnic  differences  must  be  further  explored. 

The  study  by  Gordon  was  based  on  newly-diagnosed  breast  cancer  patients  from 
northeastern  Ohio  who  participated  in  one  of  three  clinical  trials  during  two  time  periods: 
1974 to 1985 (n=164 Black and 723 White women) and 1986 to mid-1992 (n=167 Black
and  437  White  women).  Since  socioeconomic  information  was  not  obtained  from 
individuals  for  this  study,  surrogate  measures  were  used  based  on  characteristics  of  the 
census  tract  of  residence  at  the  time  of  diagnosis.  Gordon  found  that  ER-negative  tumors 
were  associated  with  low  socioeconomic  level  after  controlling  for  race/ethnicity,  age, 
and  other  patient  characteristics.  This  relationship  held  for  each  of  the  time  periods. 
Gordon  concluded  that  the  poorer  prognosis  of  lower  socioeconomic  women  might  be 
explained  by  their  less  favorable  ER  status. 

The  hospital-based  study  by  Elmore  et  al.  included  100  Black  and  300  White 
patients  diagnosed  with  breast  cancer  at  the  Yale-New  Haven  Hospital  from  January 
1985  through  December  1993.  Clinical  and  sociodemographic  information  was  collected 
from  each  patient.  In  contrast  to  earlier  studies,  no  racial/ethnic  difference  was  noted  for 
ER  status  in  this  study  population.  Black  patients  had  increased  age-adjusted  odds  ratios 
for several tumor characteristics that have been associated with a worse prognosis,
including higher stage of disease, larger tumor size, positive lymph nodes, presence of
necrosis,  vascular/lymphatic  invasion,  and  negative  PR  status.  Further  adjustment  for 
income,  medical  insurance  status,  and  method  of  detection,  however,  reduced  the 
observed  associations  and  only  tumor  size  and  necrosis  remained  statistically  significant. 
The  investigators  concluded  that  the  majority  of  histologic  features  of  breast  cancer 
measured  in  this  study  did  not  differ  between  Black  and  White  patients.  The  significant 
differences  in  tumor  size  and  necrosis  suggested  that  a  true  biologic  difference  may  exist, 
but  confirmation  is  needed  in  larger  studies. 

Other  investigators  have  reported  no  significant  racial/ethnic  differences  [Weiss 
1995, Krieger 1997a] or socioeconomic differences [Krieger 1997a] in the distribution of
hormone  receptors  and  other  molecular  biomarkers  (e.g.,  oncogenes,  cytoplasmic 
proteins,  markers  of  cell  growth)  among  breast  cancer  patients.  The  population  sizes  in 
these  studies  were  too  small,  however,  to  draw  reliable  conclusions  (Krieger:  n=44 
Black,  44  White,  43  Asian  women;  Weiss:  n=32  Black,  172  White,  49  Hispanic  women). 


CHAPTER  II.  MATERIALS  AND  METHODS 


A.  General  Study  Design 

A  population-based,  case-control  design  was  chosen  to  evaluate  the  importance  of 
socioeconomic  position  in  explaining  racial/ethnic  differences  in  tumor  characteristics 
among  women  newly  diagnosed  with  invasive,  primary  breast  cancer  during  the  years 
1992  through  1996  in  any  of  the  eleven  reporting  areas  comprising  the  Surveillance, 
Epidemiology,  and  End  Results  (SEER)  program  of  the  National  Cancer  Institute  (NCI). 
Specific  tumor  characteristics  at  the  time  of  diagnosis  including  stage,  size,  grade, 
estrogen  receptor  and  progesterone  receptor  status,  and  a  limited  set  of  demographic 
variables  were  collected  for  individual  study  subjects.  Socioeconomic  variables  were 
extracted  from  the  1990  decennial  census  data  file  and  linked  to  individual  patient 
records  to  provide  census  tract-level  information. 

The  outcome  variables  in  this  study  (tumor  stage,  tumor  size)  were  coded  as 
dichotomous  variables.  The  analysis  included  descriptive  tabulations  of  the  study 
variables  by  racial/ethnic  group  and  preliminary  two-way  comparisons  between  selected 
explanatory  variables  and  outcome  variables.  Logistic  regression  models  were  used  to 
estimate  the  relative  importance  of  race/ethnicity  and  various  socioeconomic  measures  in 
explaining  the  stage  of  disease  and  tumor  size  at  the  time  of  diagnosis. 
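
As a point of reference for the dichotomous outcomes described above, the unadjusted analogue of a logistic regression coefficient is the crude odds ratio from a 2x2 table. The sketch below computes it with a Woolf (log-based) confidence interval. The counts are hypothetical and serve only to show the arithmetic; the function name is likewise illustrative, and the study itself relied on multivariable logistic regression.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-based) 95% CI for a 2x2 table.

    a: exposed cases      b: exposed controls
    c: unexposed cases    d: unexposed controls
    """
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return estimate, exp(log(estimate) - z * se_log), exp(log(estimate) + z * se_log)

# Hypothetical counts for illustration only; these are NOT data from this study.
# "Exposed" = a given racial/ethnic group, "unexposed" = White women,
# "cases" = distant stage, "controls" = localized stage.
estimate, lower, upper = odds_ratio_with_ci(a=90, b=810, c=450, d=8100)
print(f"OR = {estimate:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```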

B. Study Aims

The  specific  aims  of  this  study  were: 

■  To  describe  the  racial/ethnic  distribution  of  selected  demographic, 
socioeconomic,  and  tumor  characteristics  (stage  of  disease,  tumor  size,  tumor 
grade,  estrogen/progesterone  receptor  status)  that  influence  prognosis  for  cancer 
of  the  female  breast. 

■  To  assess  the  importance  of  sociodemographic  factors  in  explaining  racial/ethnic 
differences  in  tumor  stage  and  size  at  the  time  of  diagnosis. 

C.  Feasibility  Assessment  and  Geocoding  Improvements 

Prior  to  conducting  the  study,  I  completed  the  following  preliminary  activities  to 
evaluate  the  feasibility  of  the  project  and  to  improve  the  quality  of  the  data: 

■  I  evaluated  the  completeness  and  accuracy  of  geocoded  information  on  residence 
at  the  time  of  diagnosis  for  cancer  patients  in  the  SEER  program  cancer  data  base. 
An  edit  check  for  valid  census  tract  codes  had  never  previously  been  conducted 
on  the  entire  data  base  and  was  necessary  to  determine  the  feasibility  of  linking 
selected  socioeconomic  variables  from  the  1990  decennial  census,  at  the  census 
tract-level, with individual cancer records from all of the SEER reporting areas.
The percentage of valid census tract codes found in this edit ranged from 60% to
95% by registry. I reported these results to each of the 11 SEER registries.

■ I created electronic files containing valid 1990 census tract codes and sent them to
each of the registries. The registry staff then edited and recoded the census tract
fields for their cancer patients and sent corrected data files to the NCI.

■ I developed a survey form, with assistance from other NCI staff, to collect
information from each of the SEER registries on their current geocoding
procedures, any problems they encounter, and the associated costs (see data
collection instrument, Appendix II-1a-c). I summarized the results of this survey
and presented them at a special meeting with all of the registry data managers
(see below).

■ I planned and chaired a special section of the annual SEER data managers’
meeting in Bethesda, MD in October 1998. The purpose was to exchange
information on geocoding procedures and to find ways to improve the
completeness and accuracy of geocoding in each of the registries.

■ On the basis of what I found from the special data edits and learned from the data
managers’ meeting, I drafted new data reporting requirements aimed at improving
the collection and geocoding of residence information. These requirements were
incorporated as revisions in the SEER coding manual.

■ I developed a new, global reporting rule that requires all SEER registries to
provide  a  variable  that  indicates  the  completeness  of  address  information  used  in 
assigning  the  census  tract  code.  This  variable  will  help  data  analysts  to  assess  the 
validity  of  the  census  tract  codes  in  future  studies. 

■  The  data  collection  changes  described  above  are  documented  as  revisions  to  the 
current  SEER  Program  Code  Manual  (Appendix  II-2a-c). 

As  a  result  of  these  efforts,  the  percentage  of  cases  that  received  a  valid  census 
tract  code  increased  to  96%,  overall. 

D.  Study  Population 

The targeted study population included all women newly diagnosed with primary
breast cancer while living in any of the eleven cancer registration
areas in the SEER program of the National Cancer Institute during 1992 through 1996.
The  SEER  registries  were  originally  chosen  for  their  ability  to  operate  and  maintain 
population-based  cancer  surveillance  systems  and  for  the  characteristics  and  size  of  the 
population  subgroups  (e.g.,  racial/ethnic  groups,  urban/rural  populations)  within  their 
reporting  areas.  The  SEER  geographic  regions  included  in  this  study  and  the  number  of 
counties and census tracts they cover are identified in Table II-1. The locations include
the  States  of  Connecticut,  Hawaii,  Iowa,  New  Mexico,  and  Utah;  and  the  metropolitan 
areas  of  Atlanta,  Detroit,  Los  Angeles,  San  Francisco  and  Oakland,  San  Jose  and 
Monterey, and Seattle.

These  areas  cover  about  14%  of  the  total  United  States  population,  and  include 
78%  of  the  Hawaiian  population,  60%  of  the  Japanese  population,  49%  of  the  Filipino 
population, 43% of the Chinese population, 34% of the Korean population, 31% of the
Vietnamese  population,  27%  of  the  American  Indian  population,  and  25%  of  the 
Hispanic  population  of  this  country.  Selected  demographic  characteristics  of  the  overall 
population  covered  by  the  eleven  SEER  registries  are  compared  with  those  for  the 
general United States population in Figure II-1. The population in the SEER coverage
areas  is  similar  to  the  general  United  States  population  with  respect  to  the  percentage  of 
people  living  below  the  poverty  level.  The  percentage  of  adults  who  graduated  from  high 
school  is  slightly  higher  in  the  SEER  areas  and  a  larger  portion  of  the  SEER  population 
lives in urban areas. Finally, the percentage of foreign-born persons living in the SEER
areas  is  nearly  double  that  for  the  United  States  as  a  whole. 

There  were  126,400  women  newly  diagnosed  with  either  in  situ  or  invasive  breast 
cancer  among  residents  of  the  SEER  coverage  areas  during  the  years  1992  through  1996 
(Figure  II-2).  Limiting  the  study  to  invasive  cancers  among  the  ten  largest  racial/ethnic 
groups  results  in  a  potential  study  group  of  107,206  breast  cancer  patients  among 
Hispanics;  and  non-Hispanic  Whites,  Blacks,  American  Indians  in  New  Mexico,  Chinese, 
Japanese,  Filipinos,  Hawaiians,  Koreans,  and  Vietnamese.  Cases  that  were  identified 
only  from  an  autopsy  record  or  death  certificate  comprised  less  than  one  percent  of  the 
intended  study  population  (n=599)  and  were  excluded  since  they  do  not  have  useful 
information  on  tumor  characteristics  at  the  time  of  diagnosis.  There  were  no  notable 
differences  between  the  excluded  group  and  the  study  group  with  respect  to  racial/ethnic 
category or registry.

The  remaining  study  group  (n=106,607)  included  over  84,000  invasive  breast 
cancer  cases  among  non-Hispanic  White  women,  about  9,000  among  non-Hispanic  Black 
women, and over 7,000 among Hispanic women. There were over 1,800; 1,500; and 1,300
cases among non-Hispanic Japanese, Filipino, and Chinese women, respectively.
Smaller numbers of cases occurred among native Hawaiian (n=508), Korean (n=301),
Vietnamese  (n=272),  and  American  Indian  (n=136)  groups. 
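
The case counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated above; the variable names are illustrative.

```python
potential_group = 107_206       # invasive cases among the ten racial/ethnic groups
autopsy_or_death_cert = 599     # identified only from autopsy record or death certificate

study_group = potential_group - autopsy_or_death_cert
excluded_share = autopsy_or_death_cert / potential_group

# Cases in the four smallest groups, as listed in the text.
smaller_groups = {"Hawaiian": 508, "Korean": 301, "Vietnamese": 272, "American Indian": 136}

print(study_group, f"{excluded_share:.2%}", sum(smaller_groups.values()))
```

The difference reproduces the final study group of 106,607, and the excluded share is below one percent, as stated.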

E.  Evaluation  of  Sample  Size 

Sample  size  calculations  were  performed  using  Epi  Info™  software  version  6.04b 
[CDC  1994]  to  indicate  the  number  of  “unexposed”  (White)  and  “exposed”  (other 
specific  racial/ethnic  group)  study  subjects  that  would  be  required  to  detect  a  given  range 
of  odds  ratios.  The  odds  ratios  reflect  the  odds  of  being  in  the  “exposed”  racial/ethnic 
group  among  “cases”  (i.e.,  in  this  example,  those  with  a  diagnosis  of  distant  stage 
disease)  relative  to  that  in  the  control  group  (i.e.,  localized  stage  disease).  The  specified 
level of power is 80% and the specified probability of making a Type I error is α = .05.
The expected proportion of the “unexposed” (White) group with “disease” (distant stage
cancer)  is  0.054,  based  on  a  preliminary  examination  of  the  data.  Since  additional 
planned  case/control  comparisons  (e.g.,  regional  vs.  localized  disease;  and  large  tumor 
vs.  small  tumor)  include  larger  numbers  of  study  subjects  than  the  distant  vs.  local 
comparison,  these  results  represent  a  conservative  assessment  of  sample  size  and  power 
requirements  for  this  study. 

The calculated sample sizes for each value of the odds ratio (Table II-2a,b) may
be  compared  to  the  available  number  of  study  subjects  in  the  various  racial/ethnic  groups. 
This  comparison  indicates  that  there  are  sufficient  numbers  of  Black  and  Hispanic  study 
subjects to detect elevated odds ratios as low as 1.2 and reduced odds ratios as high as 0.8
at  80%  power  and  an  alpha  level  of  .05.  Among  Japanese,  Filipino,  and  Chinese  women 
there  are  sufficient  study  subjects  to  detect  elevated  odds  ratios  as  low  as  1.4  and  reduced 
odds  ratios  as  high  as  0.7  for  Japanese  and  0.6  for  Filipino  and  Chinese.  The  number  of 
study  subjects  available  among  Hawaiian,  Korean  and  Vietnamese  women  will  allow  the 
detection of more moderate odds ratios (elevated OR as low as 1.7 for Hawaiian, 1.9 for
Korean and 2.0 for Vietnamese; reduced OR as high as 0.3 for all groups). Among
American Indian women, odds ratios equal to or greater than 2.4 or lower than 0.1 will be
detectable  with  the  same  alpha  level  and  power. 
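
The kind of calculation described in this section can be sketched with a standard normal-approximation formula for comparing two proportions with unequal group sizes (no continuity correction). This is not a reproduction of the Epi Info routine, whose exact algorithm is not given here; the function name and the `ratio` parameter (unexposed subjects per exposed subject) are assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist

def exposed_group_size(p0, odds_ratio, ratio=1.0, alpha=0.05, power=0.80):
    """Approximate number of 'exposed' subjects needed to detect a given odds ratio.

    p0         proportion of the unexposed group with the outcome (e.g. 0.054)
    odds_ratio odds ratio to be detected (must differ from 1)
    ratio      unexposed subjects per exposed subject (assumed parameter)
    """
    # Outcome proportion among the exposed that is implied by p0 and the odds ratio.
    odds1 = odds_ratio * p0 / (1.0 - p0)
    p1 = odds1 / (1.0 + odds1)

    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided Type I error
    z_beta = NormalDist().inv_cdf(power)

    # Outcome proportion pooled over both groups, weighted by group size.
    p_bar = (p1 + ratio * p0) / (1.0 + ratio)
    n = ((z_alpha + z_beta) ** 2 * p_bar * (1.0 - p_bar) * (ratio + 1.0)
         / (ratio * (p1 - p0) ** 2))
    return ceil(n)

# Example: outcome proportion of 0.054 among the unexposed, as in the text.
for target in (1.2, 1.4, 2.0):
    print(target, exposed_group_size(0.054, target, ratio=1.0))
```

Odds ratios closer to 1 require more subjects, which is why the smaller racial/ethnic groups can only detect larger departures from 1.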

F.  Data  Linkage 

Several  variables  describing  the  tumor  and  basic  patient  demographics  were 
available  for  each  breast  cancer  study  subject  (Table  II-3).  Additional  demographic 
variables  relating  to  socioeconomic  position  are  available  from  the  1990  decennial  census 
[Census  1992]  for  small  geographic  areas  (census  tracts  or  block  numbering  areas) 
covering  the  SEER  areas  from  which  the  study  subjects  are  drawn.  The  census  variables 
chosen  for  this  study  (Table  II-3)  were  selected  following  a  review  of  the  published 
literature  on  the  measurement  of  associations  between  socioeconomic  position  and  health 
outcomes  [Last  1987,  Filkati  1995,  Patrick  1995,  Krieger  1997b]. 

The socioeconomic variables were linked to individual cases on the basis of their
residence  at  the  time  of  their  cancer  diagnosis.  This  linkage  has  never  been  attempted  on 
the  entire  SEER  program  data  base  and  is  a  unique  feature  of  this  study.  Previous  studies 
have  been  limited  to  one  or  a  few  of  the  registries  located  in  metropolitan  areas  because 
of  the  lack  of  defined  census  tracts  for  many  areas  of  the  country  in  previous  censuses 
and  because  of  the  poor  quality  of  residence  address  information  in  rural  areas. 

The  success  of  this  linkage  depended  upon  the  completeness  and  accuracy  of 
address  information  collected  on  cancer  patients,  thereby  enabling  the  assignment  of 
geocodes  (i.e.,  census  tract  or  block  numbering  area  code)  to  individual  records.  These 
geocodes  were  then  used  to  link  patient  records  with  socioeconomic  information  for 
census  tracts  or  block  numbering  areas  from  the  1990  decennial  census.  The  census  tract 
data  field,  although  collected  since  the  beginning  of  the  SEER  program,  has  not 
undergone  rigorous  data  editing  prior  to  this  study.  Preliminary  computer  edits  I 
conducted  on  the  SEER  data  file  indicated  poor  coding  of  census  tract  number  by  several 
of  the  SEER  registry  areas,  with  the  percentage  of  valid  codes  ranging  from  60%  to  95% 
of  cases.  Most  of  the  urban  SEER  areas  had  higher  percentages  of  valid  census  tract 
codes  than  did  registries  covering  entire  states. 
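
The edit check and linkage described above can be sketched as follows. The record layout, field names, and census variable are hypothetical, chosen only to illustrate the logic; the actual SEER file structure and edit programs are not described in this section.

```python
from collections import defaultdict

def percent_valid_by_registry(cases, valid_tracts):
    """Percentage of cases per registry whose (county, tract) code is in the allowed set."""
    totals, valid = defaultdict(int), defaultdict(int)
    for case in cases:
        totals[case["registry"]] += 1
        if (case["county"], case["tract"]) in valid_tracts:
            valid[case["registry"]] += 1
    return {reg: 100.0 * valid[reg] / totals[reg] for reg in totals}

def link_census(cases, census_by_tract):
    """Attach tract-level census variables; cases without a valid geocode get None."""
    return [
        {**case, "census": census_by_tract.get((case["county"], case["tract"]))}
        for case in cases
    ]

# Made-up records for illustration only.
cases = [
    {"id": 1, "registry": "A", "county": "001", "tract": "0101"},
    {"id": 2, "registry": "A", "county": "001", "tract": "9999"},  # invalid tract code
    {"id": 3, "registry": "B", "county": "003", "tract": "0200"},
]
census_by_tract = {
    ("001", "0101"): {"pct_below_poverty": 12.5},
    ("003", "0200"): {"pct_below_poverty": 30.1},
}

rates = percent_valid_by_registry(cases, set(census_by_tract))
linked = link_census(cases, census_by_tract)
print(rates)
```

Keying on the county and tract together matters because tract numbers are only unique within a county.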

After I reported the edit results to each of the registries, they reviewed their
incorrectly  coded  cases  and  were  able  to  assign  new,  valid  geocodes  to  some  of  the  study 
subjects.  The  percentage  of  valid  census  tract  codes  improved  to  80%  to  95%  in  their 
next  data  submission.  This  was  still  not  sufficient,  however,  for  the  purposes  of  my 
proposed  study.  To  help  data  managers  verify  the  accuracy  of  their  geocoding,  I 
provided  each  of  them  with  tabular  results  from  the  edits  I  had  performed  on  their  data 
submission and with electronic files containing allowable county/census tract codes for
their  coverage  areas.  From  this  effort,  I  learned  that  many  of  the  registries  were 
providing  outdated  (1980)  census  tract  codes  instead  of  the  current  (1990)  codes.  I  also 
learned  that  one  of  the  registries  was  routinely  failing  to  geocode  cases  from  several  of 
their  counties  due  to  a  mistaken  belief  that  census  tracts  had  not  yet  been  defined  for  the 
counties. 

To  obtain  detailed  information  about  the  geocoding  procedures,  associated  costs, 
and  problems  currently  experienced  by  each  of  the  registries,  I  developed  a  Geocoding 
Update Instrument, with assistance from other NCI staff (Appendix II-1a-c). I
summarized  the  results  from  this  survey  and  presented  them  at  a  meeting  I  convened  at 
the  NCI  in  October  1998  that  was  attended  by  all  of  the  SEER  Registry  Managers.  The 
meeting  facilitated  an  exchange  of  information  between  the  registries  and  focused  further 
attention  on  the  need  to  improve  the  completeness  and  accuracy  of  geocoding. 

Data  managers  from  a  largely  rural  state  registry  and  an  urban  registry  gave 
detailed  presentations  on  the  various  techniques  and  data  sources  they  use  for  geocoding. 
Issues  of  particular  interest  included  the  strengths  and  weaknesses  of  automated 
geocoding  software  and  the  use  of  rural  route  numbers,  post  office  boxes,  zip  code 
centroids  (for  5-digit,  7-digit  and  9-digit  zip  codes),  American  Indian  community  codes, 
census  tract  maps,  topologically  integrated  geographic  encoding  and  referencing  digital 
mapping  system  (i.e.,  TIGER  files),  crisscross  directories,  voting  records,  motor  vehicle 
administration  records  and  other  sources  to  obtain  necessary  address  and  geocode 
information.  During  the  meeting,  all  registry  directors  and  managers  became  acquainted 
with  a  variety  of  available  geocoding  techniques  and  identified  the  strengths  and 
limitations of the methods. When complete street address and zip code information is
available,  the  geocoding  software  is  quick,  relatively  inexpensive  and  accurate.  A  further 
advantage  is  that  the  census  block  group  (a  subunit  of  the  census  tract)  can  be  obtained 
from  the  software  in  addition  to  the  census  tract.  Block  group  coding  may  not  be  feasible 
for  cases  with  incomplete  addresses  that  require  manual  geocoding. 

Another  important  outcome  from  the  meeting  was  the  addition  of  a  new  variable 
to  the  SEER  data  base  which  I  developed  to  indicate  the  level  of  certainty  and  the 
completeness  of  address  information  used  to  assign  a  geocode  to  each  cancer  case  (e.g., 
high  certainty  =  complete  residence  address  available;  low  certainty  =  only  rural  route 
number  or  post  office  box  and  zip  code  available).  This  information  was  not  available 
for  subjects  included  in  this  study,  but  will  be  reported  for  all  new  cancer  cases 
diagnosed  on  January  1,  1998  and  thereafter.  The  new  coding  scheme  and  revisions  to 
the  SEER  Program  Code  Manual  are  reproduced  in  Appendix  II-2a-c. 

As a result of the knowledge gained from this meeting, additional efforts were
undertaken  by  each  of  the  registries  to  improve  the  geocoding  of  their  cancer  case 
records.  In  the  next  data  file  submission  from  the  registries,  the  overall  percentage  of 
study  subjects  with  valid  census  tract  codes  reached  96  percent.  This  seemed  sufficient 
to  conduct  the  proposed  study  and  was  used  as  the  final  data  analysis  file. 

G. Variable Specification

1.  Individual-level  Variables 

Cancer  type 

This study includes women newly diagnosed, between January 1, 1992 and
December 31, 1996, with an invasive cancer of the breast. This includes codes C50.0
through  C50.9  in  the  scheme  of  the  International  Classification  of  Diseases  for  Oncology, 
2nd  Edition  [ICDO-2  1990]. 

Tumor  stage 

Descriptive  information  on  the  extent  of  disease  at  the  time  of  diagnosis  was 
collected  on  all  cancer  cases.  This  information  is  based  on  a  combination  of  clinical, 
operative,  and  pathological  assessments.  If  a  discrepancy  appears  between  pathology  and 
operative  reports  concerning  excised  tissue,  priority  is  given  to  the  pathology  report.  The 
priority  for  using  information  to  code  the  extent  of  disease  is  1)  pathologic,  2)  operative 
and  3)  clinical  findings.  The  major  components  of  the  extent  of  disease  are  size  of  the 
tumor,  extension  of  the  tumor,  evidence  of  metastasis,  and  lymph  node  involvement. 

This  information  allows  the  data  to  be  collapsed  into  different  staging  schemes  and 
provides  flexibility  in  maintaining  consistency  over  time,  even  if  a  staging  scheme 
changes  [Fritz  1998]. 

Cancer  staging  is  a  method  for  grouping  patients  based  on  the  extent  of  the  spread 
of  the  cancer  from  its  site  of  origin.  Detecting  cancers  at  an  early,  more  treatable  stage  is 
a major goal of prevention and control efforts. Knowledge of the stage of disease at the
time  of  diagnosis  is  essential  for  determining  the  choice  of  therapy  and  in  assessing 
prognosis.  Tumor  stage  is  the  strongest  measure  of  the  behavior  of  invasive  breast 
cancer  and  forms  the  basis  of  prognostication  [Simpson  1996].  The  localized-regional- 
distant  summary  staging  scheme  has  been  found  useful  over  the  years  for  descriptive  and 
statistical  analysis  of  tumor  registry  data  and  is  defined  below  [Seiffert  1993]. 

Localized: An invasive malignant neoplasm confined entirely to the organ of origin
with no lymph node involvement.

Regional: A malignant neoplasm that 1) has extended beyond the limits of the organ
of origin directly into surrounding organs or tissues; or 2) involves
regional lymph nodes by way of the lymphatic system; or 3) has both
regional extension and involvement of regional lymph nodes.

Distant: A malignant neoplasm that has spread to parts of the body remote from
the primary tumor either by direct extension or by discontinuous
metastasis (e.g., implantation or seeding) to distant organs, tissues, or via
the lymphatic system to distant lymph nodes.

Unstaged: Information is not sufficient to assign a stage.

For  this  study,  one  of  the  main  outcomes  of  interest  was  to  differentiate  breast 
cancers diagnosed early enough in their natural history so as to afford a meaningful
survival  advantage.  Comparisons  of  distant  stage  disease  with  localized  disease  and 
regional  stage  with  localized  stage  are  used  because  of  the  large  difference  in  relative 
survival rates between these groups (Table I-1). In addition, studies of the increased use
of  mammography  in  early  detection  and  screening  have  demonstrated  increases  in  the 
detection  of  localized  lesions  [Thomas  1977].  Thus,  differences  in  the  relative  frequency 
of  localized  tumors  versus  more  advanced  stage  tumors  among  different  groups  of 
individuals  may  reflect  different  levels  of  medical  surveillance. 
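
The two stage comparisons described above can be written as a pair of binary outcomes, each with localized disease as the reference. The following is a minimal sketch only, not the code used in the study (the analysis was carried out in SAS); the function name `stage_outcomes` and the text labels for the stages are assumptions made for illustration.

```python
from typing import Optional, Tuple


def stage_outcomes(summary_stage: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (distant_vs_localized, regional_vs_localized) for one patient.

    Each outcome is 1 for the later stage and 0 for localized disease.
    None means the patient does not belong in that comparison.
    The stage labels are illustrative assumptions, not registry codes.
    """
    stage = summary_stage.strip().lower()
    if stage == "localized":
        return 0, 0          # reference group in both comparisons
    if stage == "distant":
        return 1, None       # excluded from the regional comparison
    if stage == "regional":
        return None, 1       # excluded from the distant comparison
    return None, None        # unstaged or unrecognized label


examples = ["Localized", "Regional", "Distant", "Unstaged"]
coded = {label: stage_outcomes(label) for label in examples}
```

Under this coding, unstaged patients drop out of both comparisons, and regional patients do not contribute to the distant versus localized comparison.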

Tumor  size 

Tumor  size  was  recorded  in  millimeters  and  refers  to  the  exact  size  of  the  primary 
tumor  at  its  largest  dimension.  If  the  patient  has  been  pretreated  with  neoadjuvant 
chemotherapy,  hormonal  therapy,  immunotherapy  or  radiation  therapy,  tumor  size  is  not 
coded  unless  it  was  measured  prior  to  the  initiation  of  these  therapies.  In  breast  cancer, 
the  size  of  the  invasive  component  of  the  primary  tumor  reflects  its  natural  history,  its 
metastatic  capacity,  and  is  an  independent  predictor  of  survival  [Simpson  1996].  The 
localized-regional-distant  summary  staging  scheme  does  not  explicitly  use  the  size  of  the 
tumor  in  assigning  a  stage.  Therefore,  tumor  size  was  also  evaluated  in  relation  to 
racial/ethnic  and  socioeconomic  factors.  Tumors  1.0  cm  or  less  in  diameter  have  an 
especially  low  risk  of  recurrence.  Several  studies  have  reported  5-year  or  10-year 
disease-free survival exceeding 90 percent for node-negative patients with tumors 1.0 cm
or less in diameter [O’Reilly 1990, Merkel 1993, Rosen 1993]. In addition, tumors
smaller than 1.0 cm are more difficult to detect by clinical breast exam [Fletcher 1985,
Reintgen 1993, Helzlsouer 1995] and their increased identification in particular
subgroups of the study population may reflect differential patterns of mammography
screening. In the data analysis, patients with tumors 1.0 cm in diameter or greater at the
time  of  diagnosis  are  compared  with  those  diagnosed  at  smaller  sizes. 

Hormone  receptor  status 

Results  of  testing  for  estrogen  receptor  and  progesterone  receptor  were  obtained 
from  medical  records  and  coded  as  positive,  negative,  borderline  or  unknown.  This 
variable  is  included  as  an  indicator  of  the  tumor  biology.  Tumors  that  are  positive  for 
hormone  receptors  tend  to  be  correlated  with  positive  prognostic  features  such  as  a  lower 
rate  of  cell  proliferation  and  evidence  of  tumor  cell  differentiation. 

Tumor  grade 

Tumors  were  classified  into  one  of  four  grades  or  unknown.  Grade  1  tumors  are 
those  considered  to  be  well-differentiated  (and  the  least  aggressive);  grade  2  corresponds 
to  moderately  differentiated;  grade  3  tumors  are  poorly  differentiated;  and  grade  4 
includes  undifferentiated  tumors  (or  highly  aggressive). 

Age  at  diagnosis 

The  age  of  the  patient  at  the  time  of  their  cancer  diagnosis  was  available  for  all 
study  subjects  and  was  measured  in  completed  years  of  life.  It  was  included  as  a 
continuous  variable  in  the  analysis. 
Race and ethnicity

Consistent  with  Office  of  Management  and  Budget  federal  data  standards,  race 
and  ethnicity  were  treated  as  two  independent  variables  [OMB  1978].  Race  was  coded 
into  one  of  36  categories  on  the  basis  of  information  in  the  medical  records  [Fritz  1998]. 
If  a  person’s  race  was  recorded  in  the  medical  record  as  a  combination  of  White  and  any 
other  specific  race,  they  are  routinely  coded  by  SEER  registries  to  the  other  specific  race. 
Ethnicity  is  used  in  the  SEER  data  base  to  denote  persons  of  Hispanic  (or  Latino)  origin. 
This  group  includes  Spanish,  Mexican,  Puerto  Rican,  Cuban,  and  South  or  Central 
American  (except  Brazil).  Persons  of  Hispanic  origin  may  be  of  any  race.  Since 
information  on  the  specific  subgroup  was  available  for  less  than  half  of  the  Hispanic 
cases,  this  group  was  analyzed  in  total.  Other  racial/ethnic  groups  were  analyzed  after 
Hispanics  were  removed,  so  that  each  group  was  mutually  exclusive.  A  design  variable 
with  10  levels  was  used  to  classify  the  ten  racial/ethnic  groups  in  the  study. 

Marital status at the time of cancer diagnosis

Many studies over the past century have shown that married individuals tend to
be  healthier  and  to  live  longer  than  non-married  individuals  [Last  1987,  Smith  1997]. 

The  mechanisms  responsible  for  this  association  are  hypothesized  to  relate  to  greater 
social  and  economic  support  and  more  healthy  lifestyles  among  the  married  [Goodwin 
1987,  Umberson  1987,  Corin  1995],  as  well  as  the  possibility  that  healthier  persons  are 
more  likely  to  be  selected  into  marriage  [Goldman  1993].  Marital  status  codes  included 
single  (never  married),  married  (including  common  law),  separated,  divorced,  widowed, 
and unknown. These categories were collapsed to not married (i.e., single, separated,
divorced, or widowed) and all other (i.e., married and unknown marital status) in the
analysis.  The  unknowns  likely  included  a  mixture  of  married  and  not  married  persons,  so 
grouping  them  with  the  married  category  probably  diluted  the  effect  of  this  variable 
somewhat. 
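
The collapsing of marital status codes described above can be sketched as follows. This is an illustration under stated assumptions: the function name `collapse_marital_status` and the text labels are hypothetical, and the registry's actual numeric codes are not reproduced here.

```python
# Recorded categories that the text groups as "not married".
NOT_MARRIED = {"single", "separated", "divorced", "widowed"}


def collapse_marital_status(code: str) -> str:
    """Collapse a recorded marital status into the two analysis groups.

    Married (including common law) and unknown both fall into
    "all other", mirroring the grouping described in the text.
    Any label not listed in NOT_MARRIED is also treated as "all other".
    """
    if code.strip().lower() in NOT_MARRIED:
        return "not married"
    return "all other"


recorded = ["Single", "Married", "Widowed", "Unknown"]
collapsed = [collapse_marital_status(value) for value in recorded]
```

Because unknowns share a category with married women, any contrast between the two groups is expected to be somewhat attenuated, as the text notes.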


SEER  area 

The  study  subjects  were  diagnosed  in  any  of  the  geographic  areas  (State  or  cluster 
of  contiguous  counties)  covered  by  the  eleven  SEER  cancer  registries.  Since  the  SEER 
registries  are  located  in  different  regions  of  the  country  which  may  have  different 
patterns  of  medical  practice  or  cancer  risk  factors,  design  variables  representing  the 
SEER  areas  were  included  in  the  analysis  to  control  for  potential  confounding. 

Census  tract  and  county  of  residence  at  the  time  of  diagnosis 

The  residence  address  of  each  cancer  case  at  the  time  of  diagnosis  was  used  by 
registry  personnel  to  assign  a  census  tract  or  block  numbering  area  and  county  code. 
Census  tracts  or  block  numbering  areas  are  the  smallest  geographic  areas  currently 
recorded  by  SEER.  They  represent  statistical  subdivisions  of  a  county  and  are 
established  and  maintained  by  local  committees.  Census  tracts  usually  contain  between 
2,500  and  8,000  people,  and  average  about  4,000  people.  The  geographic  size  of  a 
census  tract  varies,  therefore,  depending  on  how  densely  an  area  is  settled. 

Census  tracts  are  designed  to  be  homogeneous  with  respect  to  population 
characteristics,  economic  status  and  living  conditions  at  the  time  they  are  created  [Census 
1993].  At  the  time  of  the  1980  decennial  census,  not  all  counties  in  the  U.S.  were 
covered by census tracts. The remaining untracted counties were subdivided into either
census  tracts  or  block  numbering  areas,  however,  by  the  time  of  the  1990  census.  Block 
numbering  areas  are  mutually  exclusive  of  census  tracts  and  are  generally  used  in 
sparsely  populated  counties.  As  a  result,  the  population  size  of  a  block  numbering  area  is 
typically  smaller  than  that  of  a  census  tract  and,  in  some  instances,  a  very  thinly 
populated  county  may  be  covered  by  a  single  block  numbering  area.  There  are  over 
7,900  census  tracts  or  block  numbering  areas  within  the  geographic  regions  covered  by 
the SEER Program (Table II-1).

2.  Census  Tract-level  Variables 

The  utility  of  community-level  socioeconomic  variables  has  been  shown  in 
studies  assessing  the  impact  of  socioeconomic  position  on  hospital  admissions  [Hofer 
1998]  and  on  selected  health  outcomes  [Krieger  1992,  Anderson  1997,  Robert  1998]. 
Some  investigators  conclude  that  neighborhood-based  measures  of  socioeconomic 
position  merit  greater  use  in  public  health  research  and  surveillance  because  they 
characterize  aspects  of  a  person’s  living  conditions  that  may  not  be  evident  from 
individual-level  measures,  particularly  when  studying  diverse  racial/ethnic  groups 
[Kaplan  1996,  Krieger  1997b].  For  example,  individuals  coded  as  ‘White’  at  each 
socioeconomic  level  may  be  more  likely  to  live  in  more  affluent,  safer,  and  less  polluted 
neighborhoods than individuals coded as ‘non-White’ [Massey 1990]. Neighborhood-based
measures have the advantage of applicability across all age groups and both sexes.
They  also  tend  to  provide  a  more  stable  estimate  of  the  relevant  economic  situation  of 
individuals than do some of the more volatile individual measures such as personal
income.  Even  when  individual-level  data  are  available,  neighborhood-level  measures 
enable  the  conduct  of  contextual  analyses  to  determine  how  socioeconomic  factors  at 
multiple  levels  shape  population  patterns  of  health  and  disease  [Krieger  1992,  Anderson 
1997,  Krieger  1997b]. 

Working-class  job 

There  are  no  census-derived  data  which  explicitly  measure  socioeconomic  level. 
Many  socioeconomic  measures  are  based  on  an  occupational  classification,  however, 
since  occupation  is  considered  to  be  a  reliable  indicator  of  relative  standing  in  industrial 
societies  [Liberatos  1988].  Census  occupational  data  can  be  used  to  create  a  measure  of 
neighborhood  socioeconomic  level  by  selectively  combining  the  census-defined 
occupational  categories  into  a  group  that  predominantly  contains  people  in  “working 
class”  jobs  (Table  II-3).  This  group  largely  consists  of  employees  who  do  not  own  their 
own  workplace,  are  not  self-employed,  and  generally  occupy  subordinate  positions  at 
work  [Wright  1982,  Krieger  1992].  This  scheme  has  been  validated  through 
comparisons with individual-level measures of social standing [Krieger 1991, Krieger
1992] and has been reported to be associated with breast cancer incidence and survival
[Bassett  1986,  Krieger  1990],  prevalence  of  sexually  transmitted  diseases  [Ellen  1995] 
and  smoking  status,  parity,  height  and  hypertension  [Krieger  1991,  Krieger  1992].  In  the 
present  study,  this  classification  scheme  was  used  to  characterize  census  tracts  by  the 
percentage of employed persons in the tract, aged 16 and over, that are in “working-class”
occupations. The census tract value for this variable (and all subsequent variables
in this section) was linked to individual study subjects on the basis of their residence at
the  time  of  their  cancer  diagnosis. 
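
The linkage of a tract-level value to individual subjects can be pictured as a simple lookup keyed on census tract code. The sketch below is an assumption-laden illustration, not the study's linkage procedure: the field names `census_tract` and `pct_working_class`, the tract codes, and the percentages are all hypothetical.

```python
from typing import Dict, List


def link_tract_values(
    subjects: List[dict],
    tract_values: Dict[str, float],
    key: str = "census_tract",
    out: str = "pct_working_class",
) -> List[dict]:
    """Attach a tract-level value to each subject by tract of residence.

    Subjects whose tract code is missing or not found receive None, so
    that unlinked records can be counted rather than silently dropped.
    """
    linked = []
    for subject in subjects:
        row = dict(subject)  # copy, leaving the input record untouched
        row[out] = tract_values.get(row.get(key))
        linked.append(row)
    return linked


# Hypothetical tract codes and percentages, for illustration only.
tracts = {"06001400100": 42.5, "06001400200": 61.0}
cases = [
    {"id": 1, "census_tract": "06001400100"},
    {"id": 2, "census_tract": "06001400200"},
    {"id": 3, "census_tract": None},  # address could not be geocoded
]
linked = link_tract_values(cases, tracts)
unlinked = sum(1 for row in linked if row["pct_working_class"] is None)
```

Every subject in the same tract receives the same value, which is why the completeness of census tract coding determines how many records can carry socioeconomic data.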

Income 

Income  derives  from  a  variety  of  sources  including  wage  earnings,  interest, 
dividends,  child  support,  alimony,  transfer  payments,  and  pensions.  It  has  been  found  to 
be  strongly  associated  with  outcomes  ranging  from  self-perceived  health  [DHHS  1991] 
to  mortality  [Backlund  1996].  Neighborhood-level  gradients  in  income  have  also  been 
linked  to  mortality  [Smith  1996a,  Smith  1996b],  cancer  incidence  and  survival  [Devesa 
1983,  Greenwald  1996],  and  use  of  health  services  [Cherkin  1992].  A  problem  with  the 
use  of  an  income  variable  that  is  collected  only  for  one  point  in  time,  is  that  it  may  fail  to 
capture  important  information  about  income  fluctuations.  Another  weakness  is  that, 
unlike  the  poverty  variable  described  below,  the  family  or  household  income  variables 
are  not  adjusted  for  the  number  of  persons  supported  by  the  income  and  will  therefore 
have  different  meanings  for  different  size  households.  The  census  tract-level  measure  of 
median  family  income  was  used  in  the  present  study. 

Poverty 

The  poverty  threshold  set  by  the  Bureau  of  the  Census  is  an  economic  indicator  of 
need.  Unlike  measures  of  median  family  income,  poverty  status  takes  into  account  the 
size  and  age  structure  of  a  family  and  is  related  to  the  ability  to  purchase  a  specific 
market  basket  of  food  [Census  1992].  In  1989,  the  average  poverty  threshold  for  a 
family of four persons was $12,674. Poverty thresholds are applied on a national basis,
without adjustment for regional, State or local variations in the cost of living. The
poverty  status  variable  represents  the  percentage  of  all  persons  in  a  census  tract  who  are 
living  below  the  poverty  threshold  for  their  given  family  size.  Federally  defined  poverty 
areas  are  those  in  which  20%  or  more  of  the  population  lives  below  the  poverty  line 
[Census  1985].  This  definition  of  poverty  area,  applied  to  census  block  group  data,  has 
been  associated  with  several  health  outcomes  [Krieger  1990,  Krieger  1991,  Krieger  1992, 
Ellen  1995].  Another  poverty  variable  included  for  the  analysis  indicates  the  percentage 
of  families  headed  by  women  with  no  husband  at  home,  with  one  or  more  children,  and 
who  are  living  below  the  poverty  level.  It  is  possible  that  this  variable  captures 
additional  factors,  such  as  increased  time  demands  or  stress,  that  may  help  to  explain 
patterns  of  health  care  utilization.  The  utilization  patterns  may,  in  turn,  influence  the 
severity  of  disease  at  the  time  of  diagnosis. 

Wealth 

Privilege  and  wealth  represent  the  opposite  end  of  the  socioeconomic  spectrum 
from  deprivation  and  poverty.  Wealth  encompasses  accumulated  assets,  usually  obtained 
through  inheritance,  investment  or  other  forms  of  saving  [Krieger  1997b].  Homes  and 
cars  represent  the  most  commonly  owned  assets  in  the  United  States  and  information  on 
ownership  can  usually  be  obtained  easily  and  reliably.  European  studies  have  reported 
associations  between  car  and  home  ownership  and  mortality  rates  [Filakti  1995]  and 
cancer  survival  [Petridou  1994].  The  percentage  of  households  in  a  census  tract  that  own 
their home and the percentage that do not own a car were calculated from the 1990
census data.

Education

Education  is  another  widely  used  indicator  of  socioeconomic  position  in  public 
health  research  and  has  been  shown  to  be  an  important  predictor  of  mortality  and 
morbidity  [Feldman  1989,  Reis  1991,  Heck  1997,  Krieger  1997b].  The  amount  of 
education  and  knowledge  attained  by  an  individual  influences  lifestyle  behaviors  (e.g., 
exercise,  diet)  and  may  also  provide  qualifications  for  certain  occupations  and  income 
[Liberatos 1988]. Its advantages include ease of measurement; relevance for persons not
actively  employed  (e.g.,  house  parent,  unemployed,  retired);  and  its  stability  over  the 
adult  lifespan,  regardless  of  changes  in  health  status.  Education  was  selected  as  a 
practical  measure  for  socioeconomic  position  in  the  1989  revision  of  the  U.S.  standard 
death  certificate  [Tolson  1991].  Some  investigators  suggest  that  it  is  more  meaningful  to 
measure  education  in  terms  of  certification  or  academic  degrees  achieved,  rather  than  by 
the  number  of  years  of  schooling,  because  the  academic  credentials  have  important 
implications for employment prospects [Faia 1981, Liberatos 1988]. When under-educated
areas are defined as census tracts in which 25% or more of adults age 25 and
older  have  not  completed  high  school,  associations  between  this  ecological  measure  and 
selected  health  characteristics  were  found  to  be  similar  to  associations  based  upon 
education  data  for  individuals  [Krieger  1992].  The  percentage  of  persons  aged  25  years 
and  older  who  did  not  have  at  least  a  high  school  diploma  and  the  percentage  who  had  a 
bachelor’s degree or higher were calculated for each census tract in the study and applied
to  all  study  subjects  within  each  of  the  tracts. 
Urban residence

Urban/rural designations are among the most commonly used ecological variables
in health research; however, the effect that an urban environment may be expected to
have  on  health  is  unclear.  Most  urban  environments  have  positive  and  negative  qualities, 
and  these  qualities  are  not  experienced  equally  by  all  residents  [Verheij  1996].  Besides 
differences  in  exposure  to  physical  risk  factors  (e.g.,  noise,  pollution)  or  access  to  health 
care,  people’s  values  and  attitudes  about  health  may  differ  in  urban  versus  rural  areas. 
This can lead to differences in health behaviors (e.g., diet, exercise, smoking, care-seeking)
and ultimately health status [Patrick 1995]. The percentage of the population in
a  census  tract  living  in  an  urban  area  was  calculated  from  the  1990  census  data.  About 
77%  of  the  study  subjects  were  classified  as  living  in  a  census  tract  that  is  considered  to 
be  100%  urban.  Because  of  the  extreme  skewness  of  the  study  data,  this  variable  was 
coded  as  a  binary  variable  with  census  tracts  classified  as  urban  (100%  of  the  population 
lives  in  an  urban  area)  or  not  urban  (<100%  of  the  population  lives  in  an  urban  area). 
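
The binary coding of urban residence described above amounts to a single threshold at 100 percent. A minimal sketch, assuming the tract-level percentage is available as a number between 0 and 100 (the function name `is_urban_tract` is hypothetical):

```python
def is_urban_tract(pct_urban: float) -> int:
    """Return 1 when the entire tract population is urban, otherwise 0."""
    if not 0.0 <= pct_urban <= 100.0:
        raise ValueError("pct_urban must be between 0 and 100")
    return 1 if pct_urban >= 100.0 else 0


# Hypothetical tract percentages, for illustration only.
flags = [is_urban_tract(p) for p in (100.0, 99.9, 35.0, 0.0)]
```

A tract that is 99.9 percent urban is coded the same as a fully rural tract under this rule, which is a consequence of dichotomizing a highly skewed variable.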

Unemployment 

Most  individuals  in  the  United  States  obtain  health  insurance  through  an 
employer.  The  possession  of  health  insurance,  in  turn,  influences  access  to  health  care, 
including  preventive  care.  Areas  with  high  unemployment  among  persons  in  the  labor 
force  may  therefore  be  related  to  tumor  characteristics  that  influence  prognosis.  The 
percentage  of  the  population  that  was  unemployed  among  those  in  the  labor  force  was 
calculated from the 1990 census data. Excluded from the labor force were: 1) persons
under 16 years of age, and 2) among those over age 16: students, housewives, retired
workers, seasonal workers enumerated in an “off” season who were not looking for work,
institutionalized  persons,  and  persons  doing  only  incidental  unpaid  family  work  [Census 
1992]. 


Foreign-born 

The  relationship  between  migration  and  health  may  vary  by  socioeconomic 
position  and  by  the  reason  for  migrating  (e.g.,  political,  economic).  Migrants  usually 
adopt  at  least  some  of  the  cultural  characteristics  of  the  community  into  which  they 
move, although this may take as much time as a generation or more [Last 1987]. Effects
on health may be mediated through changes in diet and levels of stress. The percentage
of the population in each census tract that was born outside of the United States was
calculated from the 1990 census data.

H.  Data  Quality 

All  data  on  cancer  cases  were  collected  by  specially-trained  medical  records 
abstractors  following  well-documented,  standardized  procedures  [Fritz  1998].  Collected 
data  have  passed  extensive  field  and  central  office  quality  control  edits.  Overall 
completeness  of  case  reporting  by  the  SEER  cancer  registries  has  been  measured  to  be 
about  98%,  based  on  independent  audits  of  a  stratified  random  sample  of  hospitals  in  six 
of  the  coverage  areas  [Zippin  1995].  Cancers  of  the  cervix  (in  situ),  melanoma,  unknown 
primary,  and  leukemia  were  disproportionately  represented  among  the  missing  cases, 
based  upon  the  overall  distribution  of  cancers  reported  by  the  registries  [Zippin  1995]. 
Since there is wide variability in the classification of in situ cancers of the cervix, these
tumors  are  no  longer  routinely  collected  by  SEER  registries.  The  impact  of  this  change 
is  to  further  increase  the  overall  completeness  of  case  reporting.  Cancers  of  the  breast 
are  not  among  the  specific  types  of  cancer  reported  to  be  missing  in  excess  of 
expectation. 

Ninety-nine  percent  of  the  106,607  breast  cancer  cases  in  the  study  population 
had  evidence  in  the  medical  record  of  microscopic  confirmation  of  the  diagnosis.  A 
reabstracting  study  of  breast  cancer  diagnoses  in  1992  [Zippin  unpublished],  based  on  a 
stratified random sample (n=1,100) of cases from hospital facilities in all of the SEER
reporting  areas,  found  discrepancies  in  extent  of  disease  codes  that  resulted  in  changing 
the  summary  stage  of  disease  category  for  3.8%  of  the  cases.  Among  these,  six  cases 
(0.5%)  were  misclassified  as  invasive  tumors  when  further  investigation  indicated  that 
they  were  in  situ  lesions.  The  results  from  this  reabstracting  study  were  used  to  develop 
a  new  set  of  abstracting  and  coding  guidelines  which  serve  as  training  materials  in  annual 
workshops  for  data  managers  and  were  adapted  for  use  in  data  editing  software. 

Special revisions (described in Section II.D. Data Linkage) were made to data
collection  manuals  and  procedures  and  new  data  edits  were  implemented  in  preparation 
for  conducting  the  proposed  study  of  socioeconomic  factors  and  racial/ethnic  differences 
in  tumor  characteristics  for  cancer  of  the  female  breast.  These  efforts  resulted  in 
improved  assignment  of  valid  census  tract  codes  to  cancer  patient  records  used  in  this 
study.  This  enabled  a  more  complete  linkage  of  census  demographic  data  to  study 
records and thereby reduced the potential effect of a reporting bias on study findings.
Since these changes are now incorporated into the standard data collection and
management  operations  of  the  SEER  registries  and  the  NCI,  they  will  result  in  a 
permanent  improvement  in  the  overall  quality  and  utility  of  the  SEER  Program  data  base. 

I.  Characterization  of  Study  Subjects  Excluded  from  Analysis 

1.  Analysis  by  Stage  of  Disease 

The distribution of study subjects (n=106,607) by racial/ethnic group and
availability  of  information  on  stage  of  disease  at  the  time  of  diagnosis  is  shown  in  Table 
II-4.  Ninety-seven  percent  (n=103,371)  of  the  study  subjects  in  these  ten  racial/ethnic 
groups  had  sufficient  information  from  medical  records  to  assign  a  tumor  stage.  The 
percentage  of  unstaged  cases  is  slightly  higher  among  Blacks  and  Koreans  than  among 
the  other  specific  groups.  The  percentage  of  cases  with  staging  information  was  fairly 
consistent  across  age  groups  with  the  exception  of  women  diagnosed  at  age  90  and  over 
(Table  II-5).  Although  this  group  had  the  highest  percentage  of  cases  with  missing 
tumor  stage,  they  accounted  for  only  8%  of  all  cases  without  staging  data. 

Socioeconomic information could not be linked to 3.7% of the staged cases.
Most of these were due to incomplete or non-specific residence address information (e.g.,
missing  house  number,  post  office  box  number,  rural  route  number)  which  prevented  the 
assignment  of  a  valid  census  tract  code.  The  age  distribution  of  those  missing 
socioeconomic  data  was  similar  to  those  with  complete  information.  The  Hawaii  cancer 
registration  area  had  the  largest  percentage  of  cases  that  could  not  be  linked  to  census 
tract  socioeconomic  information  (14%),  while  all  other  areas  had  fewer  than  10%  of  their 
cases that could not be linked (Table II-6). As a result of the poorer match rate in
Hawaii,  the  percentage  of  native  Hawaiians  missing  socioeconomic  data  was  higher  than 
that  for  other  racial/ethnic  groups  (Table  II-7). 

2.  Analysis  by  Tumor  Size 

The distribution of study subjects (n=106,607) by racial/ethnic group and
availability  of  information  on  tumor  size  at  the  time  of  diagnosis  is  shown  in  Table  II-8. 
Tumor  size  was  available  from  medical  records  for  91  percent  (n=96,871)  of  the  study 
group.  A  small  number  of  the  cases  (0.1%)  had  a  diagnosis  of  Paget’s  disease,  which 
refers  to  neoplastic  eczematous  changes  around  the  nipple.  These  cases  were  not 
associated  with  an  underlying  invasive  tumor  mass,  and  therefore,  had  no  tumor  size 
measurement.  Black  women  had  a  higher  percentage  of  cases  lacking  tumor  size 
information  than  the  other  racial/ethnic  groups.  Availability  of  tumor  size  data  was  fairly 
consistent  by  age  group,  with  the  exception  of  women  aged  90  years  and  older.  The 
oldest women had the highest percentage of cases with missing tumor size
information,  but  this  group  accounted  for  only  3%  of  all  cases  lacking  tumor  size  data 
(Table  II-9). 

Socioeconomic  information  could  not  be  linked  to  3.7%  of  the  cases  with  tumor 
size  information.  The  age  distribution  of  those  missing  socioeconomic  data  was  similar 
to  those  with  complete  information.  The  Hawaii  cancer  registration  area  had  the  largest 
percentage  of  cases  that  could  not  be  linked  to  census  tract  socioeconomic  information 
(14%),  while  all  other  areas  had  fewer  than  9%  of  their  cases  that  could  not  be  linked 
(Table II-10). As a result of the poorer match rate in Hawaii, the percentage of native
Hawaiians missing socioeconomic data was higher than that for other racial/ethnic
groups (Table II-11).

J.  Analytic  Methods 

As  stated  earlier,  a  population-based  case-control  design  was  chosen  to 
investigate  the  importance  of  socioeconomic  position  in  explaining  racial/ethnic 
differences  in  tumor  characteristics  among  women  newly  diagnosed  with  invasive 
primary  breast  cancer  during  the  years  1992  through  1996  in  one  of  the  eleven  reporting 
areas  comprising  the  Surveillance,  Epidemiology,  and  End  Results  (SEER)  program  of 
the  National  Cancer  Institute.  The  results  of  preliminary  two-way  comparisons  between 
outcome  variables  (tumor  stage,  tumor  size)  and  explanatory  variables  were  expressed  as 
unadjusted  odds  ratios  with  95%  confidence  intervals.  The  odds  of  being  a  case  for 
specific  racial/ethnic  groups  were  compared  to  that  for  White  women,  primarily  because 
they  were  the  largest  group  available  for  study.  The  odds  ratios  reflect  the  odds  of  being 
diagnosed with late stage disease (or larger tumor size) in the “exposed” group (e.g.,
Black, Hispanic, Japanese, etc.) relative to the odds for a similar diagnosis among White
women. 
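
An unadjusted odds ratio of this kind can be computed from a two-by-two table of stage by racial/ethnic group. The sketch below is illustrative only. The text does not state which interval method was used for the unadjusted estimates, so the log-based (Woolf) interval shown here is an assumption, and the counts are hypothetical rather than taken from the study tables.

```python
import math
from typing import Tuple


def odds_ratio_ci(
    a: int, b: int, c: int, d: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    a: cases in the comparison group (e.g., late stage, specific group)
    b: controls in the comparison group (e.g., localized, specific group)
    c: cases in the reference group (e.g., late stage, White women)
    d: controls in the reference group (e.g., localized, White women)
    z: normal quantile; 1.96 gives an approximate 95% interval
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study tables.
estimate, lower, upper = odds_ratio_ci(a=120, b=600, c=900, d=9000)
```

An interval that excludes 1.0 indicates that the odds of a later stage diagnosis in the comparison group differ from those among the reference group at the chosen confidence level.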

Multiple  regression  models  were  developed  to  evaluate  the  importance  of 
socioeconomic  variables  and  other  demographic  factors  in  explaining  racial/ethnic 
differences  in  tumor  stage  and  size  at  the  time  of  diagnosis.  The  SAS  statistical  program 
LOGISTIC  Procedure  [SAS  Logistic  1989]  was  used  since  the  outcome  variables  for  the 
regression analysis are binary. The outcome variables included: 1) distant stage breast
cancer vs. localized stage; 2) regional stage disease vs. localized stage; and 3) primary
breast tumor size >1.0 cm vs. ≤1.0 cm. Several investigators have reported a diminishing
of  the  effect  of  socioeconomic  factors  on  mortality  with  increasing  age.  To  determine 
whether  a  similar  relationship  might  hold  in  the  present  study,  where  the  health  outcome 
is  severity  of  disease,  terms  for  the  cross-product  of  age  at  diagnosis  and  socioeconomic 
variables  are  considered  for  inclusion  in  the  models  as  potential  confounders  [Sorlie 
1992, Backlund 1996, Kaufman 1998]. Finally, the significance of interaction terms
between racial/ethnic group and sociodemographic factors was evaluated in the models.
Thus, six types of factors were examined in the models: (1) age at diagnosis,
(2) geographic area, (3) sociodemographic factors, (4) tumor biology, (5) cross-product
terms  of  socioeconomic  factors  and  age  at  diagnosis  considered  as  potential  confounders, 
and  (6)  interaction  terms  between  racial/ethnic  group  and  socioeconomic  factors. 

Odds  ratios  associated  with  the  explanatory  variables  were  computed  by 
exponentiating  the  estimated  coefficients  of  the  fitted  logistic  model  and  are  presented 
with  their  95%  profile  likelihood  confidence  limits.  In  some  instances,  it  is  of  greater 
interest  to  present  the  change  in  the  odds  ratio  for  something  larger  than  a  one-unit 
change  in  the  explanatory  variable  (e.g.,  socioeconomic  variables  with  values  ranging 
from  0  to  100).  The  odds  ratio  may  be  customized  in  these  cases  by  multiplying  the 
estimated  coefficient  by  a  constant  c  (where  c  represents  a  change  of,  say,  10  or  20  units)
and  then  exponentiating  the  product.  The  goal  of  the  modeling  was  to  determine  the 
degree  to  which  racial/ethnic  differences  in  tumor  stage  and  size  could  be  explained  by 
socioeconomic  factors.  Therefore,  changes  in  the  magnitude  of  the  odds  ratios,  after
socioeconomic  variables  and  other  control  variables  are  added  to  logistic  regression
models  containing  race/ethnicity,  are  used  to  assess  the  importance  of  these  additional 
factors.  Since  the  main  purpose  of  the  regression  analysis  was  to  assess  confounding  by 
socioeconomic  factors,  model-building  strategies  such  as  stepwise  or  best  subsets  were 
not  appropriate.  The  logistic  regression  models  were  also  used  to  calculate  the  adjusted 
proportion  with  late  stage  disease  or  larger  tumor  size  among  selected  racial/ethnic 
groups  by  education,  tumor  grade  and  hormone  receptor  status.  This  method 
standardized  the  proportions  to  the  distribution  of  the  remaining  covariates  in  the  full 
regression  models  for  the  entire  study  group  [Graubard  1999].  Ninety-five  percent 
confidence  limits  for  these  proportions  were  calculated  as  ±  1.96  times  the  standard  error. 
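
The rescaling of an odds ratio to a change of c units, described above, can be illustrated with a brief sketch. The coefficient and standard error below are hypothetical values, not estimates from this study, and the limits shown are simple Wald limits used only for illustration; the limits reported in this thesis are profile likelihood limits.

```python
import math


def odds_ratio_for_change(beta, c=1.0):
    """Odds ratio for a c-unit increase in a covariate with logit coefficient beta."""
    return math.exp(c * beta)


def wald_limits_for_change(beta, se, c=1.0, z=1.96):
    """Approximate Wald limits for the c-unit odds ratio (illustration only)."""
    lower = math.exp(c * (beta - z * se))
    upper = math.exp(c * (beta + z * se))
    # Sorting keeps the limits in order even when c is negative.
    return tuple(sorted((lower, upper)))


# Hypothetical coefficient for a census tract percentage scaled from 0 to 100.
beta, se = 0.01, 0.002
or_10 = odds_ratio_for_change(beta, c=10)        # odds ratio per 10-point increase
low_10, high_10 = wald_limits_for_change(beta, se, c=10)
```

Because the coefficient is multiplied by c before exponentiating, the odds ratio for a 10-unit change equals the one-unit odds ratio raised to the tenth power.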

Since  collinearities  among  the  independent  variables  could  have  resulted  in
unrealistically  large  estimated  coefficients  and  standard  errors  [Hosmer  1989],  a
correlation  matrix  was  constructed  to  identify  socioeconomic  variables  with  strong  linear
relationships  [SAS  Corr  1990].  The  goodness-of-fit  of  the  models  was  assessed  using  the
Hosmer  and  Lemeshow  summary  statistic  [Hosmer  1989].  For  this  statistic,  observed 
and  expected  numbers  of  observations  were  calculated  for  each  of  ten  groups  of 
approximately  equal  size  based  on  the  percentiles  of  the  estimated  probabilities  of  an 
event  (an  event  is  defined  as:  distant  stage  disease,  regional  stage  disease,  or  tumor 
greater  than  1  cm  in  diameter).  Observations  were  sorted  in  increasing  order  of  their 
estimated  probability  of  having  an  event  outcome.  The  discrepancies  between  the 
observed  and  expected  number  of  observations  in  these  groups  were  summarized  by  the 
Pearson  chi-square  statistic  and  compared  to  a  chi-square  distribution  with  degrees  of 
freedom  equal  to  the  number  of  groups  minus  two. 
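
The grouping and summation steps of this statistic can be sketched as follows. This is a minimal illustration of the calculation as described above, not the SAS implementation; the function names are hypothetical, fitted probabilities are assumed to lie strictly between 0 and 1, and the closed-form tail probability is valid only for an even number of degrees of freedom (as with ten groups).

```python
import math


def hosmer_lemeshow(y, p, groups=10):
    """Return (statistic, degrees of freedom) for outcomes y (0/1) and fitted probabilities p."""
    # Sort observations in increasing order of estimated probability.
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    statistic = 0.0
    for g in range(groups):
        # Groups of approximately equal size based on the sorted order.
        members = order[g * n // groups:(g + 1) * n // groups]
        size = len(members)
        observed = sum(y[i] for i in members)
        expected = sum(p[i] for i in members)
        # Pearson chi-square terms for events and for non-events.
        statistic += (observed - expected) ** 2 / expected
        statistic += ((size - observed) - (size - expected)) ** 2 / (size - expected)
    return statistic, groups - 2


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability using the closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form applies only to positive even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical, perfectly calibrated data: each group of two holds one event.
y_demo = [0, 1] * 10
p_demo = [0.5] * 20
stat_demo, df_demo = hosmer_lemeshow(y_demo, p_demo)
```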

K.  Human  Subjects  and  Confidentiality

I  obtained  Institutional  Review  Board  (IRB)  approval  from  the  Uniformed 
Services  University  of  the  Health  Sciences  for  this  study  involving  human  subjects  under 
Project  Number  T087KO-01.  All  information  collected  on  study  subjects  was  treated  as 
confidential.  Study  ID  numbers  replaced  the  subject’s  name  on  all  data  files  submitted  to 
the  NCI  for  use  in  this  study. 

L.  Roles  as  Study  Investigator 

My  roles  as  the  study  investigator  included: 

1.  Conducting  the  geocoding  meeting  and  data  linkage  feasibility  assessment;

2.  Developing  revised  data  coding  instructions  and  creating  a  new  variable  to  be 
reported  by  all  SEER  cancer  registries  and  documented  as  changes  in  the  SEER 
Coding  Manual; 

3.  Conceiving  the  research  questions; 

4.  Developing  the  study  design,  including  selection  of  the  analytic  methods  and 
obtaining  IRB  approval; 

5.  Designing  and  directing  the  creation  of  the  data  files  needed  for  the  analysis;  and 

6.  Conducting  all  of  the  data  analysis. 


CHAPTER  III.  RESULTS 


A.  Descriptive  Analysis 

There  were  a  number  of  racial/ethnic  differences  in  tumor  characteristics  and 
sociodemographic  factors  among  the  invasive  breast  cancer  patients  included  in  this  study
(Table  III-1a,b).  Due  to  the  large  size  of  this  study  population,  chi-squared  tests  for
ordinal  and  nominal  response  variables  indicated  that  there  were  statistically  significant 
racial/ethnic  differences  for  all  study  factors.  Japanese  and  White  women  tended  to  be 
diagnosed  at  an  earlier  stage,  with  smaller  diameter  tumors  and  at  a  lower  tumor  grade 
than  other  groups.  Black  and  Hispanic  women  were  more  likely  than  other  groups  to  be 
diagnosed  with  metastatic  disease,  have  tumors  2  cm  or  larger  in  diameter,  and  have 
poorly  differentiated  tumors.  American  Indian  and  Vietnamese  patients  also  had  a  higher 
percentage  of  advanced  disease  than  other  groups.  Relative  to  Japanese  and  White 
patients,  a  larger  percentage  of  the  tumors  among  all  other  racial/ethnic  groups  were  2.0 
cm  or  greater  at  the  time  of  diagnosis.  Korean  and  Vietnamese  women  were  also  more 
likely  to  have  poorly  differentiated  tumors.  Black,  Korean  and  American  Indian  women 
had  the  highest  percentage  of  tumors  that  were  negative  for  hormone  receptors. 

There  were  also  notable  differences  in  social  and  economic  factors  among  the 
groups.  Black  women  were  less  likely  to  be  married  at  the  time  of  diagnosis.  American 
Indian,  Hispanic  and  Black  women  were  similar  with  regard  to  many  of  the  census  tract 
level  indicators  of  socioeconomic  position.  They  were  much  more  likely  to  be  living  in 
less  educated  and  poorer  neighborhoods,  as  measured  by  the  percentage  of  residents
without  a  high  school  diploma,  median  family  income,  and  the  percentage  living  below
the  poverty  level.  They  also  tended  to  live  in  areas  where  unemployment  was  higher; 
where  residents  held  “working  class”  jobs;  and  where  a  high  percentage  of  families  were 
headed  by  women  having  one  or  more  children,  with  no  husband  living  at  home,  and 
whose  income  is  below  the  poverty  level.  Korean,  Vietnamese  and  Black  patients  lived 
more  frequently  in  areas  where  home  ownership  was  lowest,  and  Black  and  American 
Indian  patients  lived  in  areas  where  fewer  households  owned  a  car. 

Tumor  grade  information  was  missing  from  hospital  records  for  25%  of  the  study 
cases.  Those  missing  tumor  grade  were  more  likely  to  have  distant  stage  disease  (8% 
versus  5%  with  distant  stage  among  those  having  information  on  tumor  grade)  or  to  be 
missing  staging  information  (7%  versus  2%),  and  to  be  aged  80  years  and  over  at  the 
time  of  diagnosis  (16%  versus  11%).  Black  women  had  a  higher  percentage  of  tumors
classified  as  unknown  grade  than  other  racial/ethnic  groups.  Since  high  grade  tumors  are 
associated  with  poorer  survival  [Henson  1991],  survival  rates  were  compared  among 
patients  by  tumor  grade  as  an  aid  to  developing  a  coding  scheme  that  would  best  utilize 
the  available  information.  The  five-year  cumulative  relative  survival  probabilities
by  grade  for  patients  in  this  study  were  as  follows:  grade  1:  99%;  grade  2:  91%;  grade  3:
74%;  grade  4:  76%;  unknown  grade:  83%.  Therefore,  for  use  as  an  explanatory  variable 
in  the  regression  analysis,  tumor  grade  was  re-coded  as  a  binary  variable  with  poorly  and 
undifferentiated  tumors  combined  (grades  3  and  4  were  coded  as  1)  versus  all  others  (i.e., 
well  differentiated,  moderately  differentiated,  and  unknown  tumor  grade  were  coded  as 
0).  Since  patients  with  unknown  tumor  grade  probably  include  a  mix  of  those  with  high 
and  low  tumor  grades,  a  separate  analysis  was  conducted  excluding  them  from  the
regression  model  to  determine  the  impact  on  our  study  findings.

Information  on  hormone  receptor  status  was  not  available  for  23%  of  the  intended 
study  population  (4%  of  these  patients  had  a  tumor  marker  assay  performed  but  results
were  not  included  in  the  medical  record,  6%  did  not  have  an  assay  performed,  and  13% 
had  no  information).  Patients  lacking  hormone  receptor  status  information  were  more 
likely  to  have  distant  stage  disease  (10%  versus  4%  for  those  with  information  on 
hormone  receptor  status)  or  to  be  missing  staging  information  (9%  versus  1%),  and  to  be 
aged  80  years  or  more  at  the  time  of  diagnosis  (16%  versus  11%).  Forty  percent  of  the
patients  without  information  on  hormone  receptor  status  were  also  missing  tumor  grade 
information  versus  21%  of  those  with  hormone  receptor  status.  Hormone  receptor  status 
was  missing  more  frequently  among  Black  and  Hispanic  women  than  among  other
racial/ethnic  groups.  One  percent  of  the  study  cases  were  classified  as  having  a  “borderline”
test  result  for  ER  status.  A  similar  percentage  of  cases  were  also  reported  to  have  a 
borderline  test  result  for  PR  status.  Since  hormone  receptor  status  reflects  characteristics 
of  the  biology  of  breast  tumors,  survival  rates  were  compared  among  study  patients 
according  to  their  hormone  receptor  status  as  an  aid  to  developing  a  coding  scheme  that 
would  best  utilize  the  available  ER  and  PR  status  information.  Five-year  cumulative 
relative  survival  probabilities  were  highest  for  patients  in  this  study  with  either  positive 
ER  status  or  positive  PR  status  (89%)  and  lowest  for  patients  with  both  negative  ER 
status  and  negative  PR  status  (75%).  Patients  with  all  remaining  combinations  of  codes 
for  ER  status  and  PR  status  experienced  a  survival  probability  closest  to  that  of  the  hormone
receptor  negative  patients  (78%).  For  the  regression  analysis,  hormone  receptor  status  was
re-coded  as  a  binary  variable  with  tumors  classified  as  being  positive  for  either  ER  or  PR
(coded  as  0),  versus  all  others  (coded  as  1).
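
The two binary recodes described in this and the preceding paragraph can be summarized in a short sketch. The input labels ("positive", "negative", "borderline", and None for missing or unknown) are hypothetical stand-ins, not the actual SEER codes.

```python
def recode_tumor_grade(grade):
    """Return 1 for poorly differentiated or undifferentiated tumors (grades 3 and 4).

    Grades 1 and 2 and unknown grade (None) are all coded 0.
    """
    return 1 if grade in (3, 4) else 0


def recode_hormone_receptor(er_status, pr_status):
    """Return 0 if either ER or PR is positive; every other combination is coded 1."""
    return 0 if "positive" in (er_status, pr_status) else 1


def exclude_unknown_grade(grades):
    """Subset used for the separate analysis that drops patients with unknown grade."""
    return [g for g in grades if g is not None]
```

Under this scheme, a patient with negative ER status and missing PR status is grouped with the receptor negative patients, which matches the "all others" category described above.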

The  predominant  histological  type  of  breast  cancer  in  the  study  group  was  ductal 
adenocarcinoma,  which  accounted  for  over  73%  of  the  cases  (Table  III-2).  Other
histological  types  included  lobular  (13%),  adenocarcinoma  not  otherwise  specified  (4%), 
mucinous  (2.5%),  carcinoma  not  otherwise  specified  (2%),  medullary  carcinoma  (1.3%) 
and  inflammatory  carcinoma  (1.1%).  Racial/ethnic  variation  in  histological  type  was 
slight  with  White  patients  tending  to  have  a  higher  percentage  of  lobular  carcinomas 
(14%)  and  a  lower  percentage  of  ductal  adenocarcinomas  (73%)  than  other  groups. 
Lobular  carcinoma  ranged  from  6%  of  the  cases  among  Chinese  women  to  11%  among
Hispanic  women  and  ductal  adenocarcinoma  ranged  from  73%  of  the  cases  among 
Hispanic  women  to  82%  among  Korean  women. 

The  relationships  between  study  outcomes  (tumor  stage  or  size)  and  potential 
explanatory  variables  are  summarized  (1)  in  plots  of  the  log-odds  of  being  a  “case”  for 
various  levels  of  each  explanatory  factor  (Figures  III-1a-c,  III-2a-c,  and  III-3a-c)  and
(2)  in  2x2  tables  for  binary  explanatory  variables  (Tables  III-3-5).  With  the  exception  of 
the  oldest  age  group,  age  at  diagnosis  appears  to  be  negatively  associated  with  a 
diagnosis  of  distant  or  regional  stage  breast  cancer  and  with  tumors  1  cm  or  greater  in 
diameter  (Figures  III-1a,  2a,  3a).  Measures  of  socioeconomic  position  are  often  treated
as  categorical  variables  in  epidemiologic  studies.  In  the  present  study,  however,  most  of 
the  socioeconomic  variables  show  strong  linear  (on  a  log  scale)  associations  with  more 
advanced  stage  of  disease  and  larger  tumor  size.  Therefore,  they  are  treated  as 
continuous  variables  in  the  regression  analysis.  The  log-odds  of  being  a  case  is
positively  associated  with  the  percentage  of  persons  without  a  high  school  diploma;  the
percentage  living  below  the  poverty  level;  the  percentage  of  families  headed  by  women
having  one  or  more  children,  with  no  husband  living  at  home,  and  whose  income  is 
below  the  poverty  level;  the  percentage  of  persons  in  “working  class”  jobs;  the 
percentage  of  persons  that  are  unemployed;  and  the  percentage  of  households  that  do  not 
own  a  car.  The  log-odds  of  being  a  case  is  negatively  associated  with  median  family 
income  and  with  the  percentage  of  households  that  own  their  own  home.  These  patterns 
of  association  for  the  socioeconomic  variables  are  consistent  for  each  of  the  study 
outcomes.  Since  the  percentage  of  foreign-born  residents  did  not  show  an  association
with  the  outcome  variables,  either  in  the  log-odds  plots  or  when  used  as  a  single  predictor 
in  a  logistic  regression  model,  it  was  dropped  from  further  analysis. 

Tables  III-3-5  show  that  having  a  high  grade  tumor  is  positively  associated  with 
late  stage  disease  and  larger  tumor  size  in  the  univariate  analysis  (OR=2.2  for  both 
distant  and  regional  stage  disease  and  OR=3.1  for  tumors  equal  to  or  greater  than  1  cm).
Negative  hormone  receptor  status  is  positively  associated  with  distant  stage  disease  but 
not  with  regional  stage  or  with  tumor  size.  Patients  that  are  not  married  at  the  time  of 
diagnosis  show  a  positive  association  with  distant  stage  disease  and  larger  tumor  size,  but 
no  association  with  regional  stage  disease.  Living  in  an  urban  area  is  weakly  associated 
with  each  of  the  study  outcomes  (95%  CI  does  not  include  1.0,  but  this  is  not  apparent
when  the  odds  ratio  is  rounded  to  a  single  decimal  place). 
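
For readers who wish to reproduce a univariate odds ratio from a 2x2 table, a minimal sketch follows. The log-based (Woolf) confidence limits used here are an assumption made for illustration, since the method used for the univariate limits is not stated above, and the cell counts are hypothetical.

```python
import math


def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% limits from a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's method).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts chosen only to show the calculation.
or_demo, low_demo, high_demo = odds_ratio_2x2(20, 10, 10, 20)
```

In a very large study, a weak association can have limits that exclude 1.0 even though the point estimate rounds to 1.0 at a single decimal place, which is the situation noted above for urban residence.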

B.  Regression  Analysis

A  matrix  of  Pearson  correlation  coefficients  was  constructed  to  identify 
socioeconomic  variables  having  strong  linear  relationships  (Appendix  III-1).
Collinearities  among  the  independent  variables  may  produce  inflated  estimated
coefficients  and/or  standard  errors  in  the  multiple  logistic  regression  models.  As  might 
be  expected,  the  percentage  of  unemployed  persons  was  highly  correlated  with  the  two 
poverty  measures  (overall  percent  below  poverty;  the  percentage  of  families  headed  by 
women  having  one  or  more  children,  with  no  husband  living  at  home,  and  whose  income 
is  below  the  poverty  level).  The  unemployment  variable  was  therefore  excluded  from  the 
regression  analysis.  Since  the  overall  poverty  variable  showed  strong  linear  relationships 
with  the  second  poverty  variable  and  with  the  percentage  of  residents  who  did  not  own  a 
car,  it  was  also  excluded  from  the  regression  analysis. 
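
The screening step described above can be sketched as a pairwise Pearson correlation check. The variable names, the toy values, and the 0.8 threshold are all hypothetical; no numeric cutoff is stated in this section.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def strongly_correlated_pairs(columns, threshold=0.8):
    """List (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical census tract values for three socioeconomic measures.
tracts = {
    "pct_unemployed": [2, 4, 6, 8],
    "pct_below_poverty": [5, 9, 13, 17],
    "pct_no_car": [3, 1, 4, 1],
}
flagged = strongly_correlated_pairs(tracts)
```

A pair that is flagged in this way is a candidate for dropping one of its members, as was done here for the unemployment and overall poverty variables.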

The  results  from  multiple  logistic  regression  models  used  to  assess  the  importance 
of  selected  study  factors  in  explaining  racial/ethnic  differences  in  the  severity  of  disease 
at  the  time  of  diagnosis  are  shown  in  Tables  III-6-8.  Regression  models  that  included 
the  socioeconomic  ×  age  product  terms  as  potential  confounders  did  not  result  in  any
meaningful  change  in  the  magnitude  of  the  odds  ratios  for  each  of  the  racial/ethnic 
groups  and  there  was  no  improvement  in  the  precision  of  the  odds  ratios,  so  they  were 
dropped  from  the  analysis.  Interactions  between  racial/ethnic  group  and  socioeconomic 
variables  were  not  statistically  significant  at  the  5%  level,  so  they  were  also  excluded 
from  the  regression  models. 

The  first  model  in  Table  III-6  shows  odds  ratios  (ORs)  for  being  diagnosed  with
distant  stage  disease  versus  localized  disease  among  the  racial/ethnic  groups  relative  to
White  women  with  an  adjustment  only  for  age  at  diagnosis.  Odds  ratios  for  Black, 
Hispanic,  and  American  Indian  women  were  elevated  while  those  for  Japanese  were 
significantly  reduced.  Odds  ratios  for  other  groups  were  not  significantly  different  from 
1.0.  The  addition  of  geographic  area  (registry  where  diagnosed,  urban  residence)  to  the
model  slightly  lowered  the  OR  for  American  Indian  women,  while  the  OR  for  Hispanic
women  remained  unchanged.  When  sociodemographic  factors  were  incorporated  in  the
model,  the  excess  odds  for  Hispanic  women  was  reduced  by  60%  and  the  excess  odds  for
American  Indian  and  Black  women  were  further  lowered  (by  62%  and  50%, 
respectively).  In  the  final  model,  which  includes  tumor  grade  and  hormone  receptor 
status,  only  the  OR  for  Black  women  remained  significantly  elevated  at  1.3.  The  OR  for 
Hawaiian  women  was  elevated  even  after  adjustment  for  other  study  factors,  but  the  95% 
confidence  interval  included  one. 
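
One common way to express a reduction in excess odds, assumed here for illustration because the exact formula is not given above, is the change in the odds ratio divided by its original excess over 1.0. The odds ratios in the example are hypothetical and are not taken from Table III-6.

```python
def percent_reduction_in_excess_odds(or_before, or_after):
    """Percentage of the excess odds (OR - 1) removed by further adjustment."""
    if or_before <= 1.0:
        raise ValueError("defined here only for an elevated starting odds ratio")
    return 100.0 * (or_before - or_after) / (or_before - 1.0)


# Hypothetical example: an OR of 2.0 that falls to 1.4 after adjustment.
reduction = percent_reduction_in_excess_odds(2.0, 1.4)
```

On this definition, an odds ratio that falls all the way to 1.0 corresponds to a 100% reduction in excess odds.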

Table  III-7  shows  a  similar  analysis  of  the  importance  of  study  factors  in 
explaining  racial/ethnic  differences  in  the  diagnosis  of  regional  stage  disease  versus 
localized  disease.  Initial  ORs  for  Black  and  Hispanic  women  are  again  elevated  relative 
to  White  women,  though  at  somewhat  lower  levels  than  those  seen  for  distant  stage 
disease.  The  OR  for  Japanese  women  is  significantly  lower  than  that  for  White  women, 
while  ORs  for  the  other  groups  are  not  significantly  different  from  1.0.  The  addition  of
geographic  area  and  sociodemographic  factors  reduced  the  ORs  for  Black  and  Hispanic 
women,  but  no  further  reduction  occurred  after  the  inclusion  of  biological  characteristics 
of  the  tumors.  The  lower  OR  for  Japanese  women  remained  unchanged  after  the  addition 
of  each  group  of  potential  confounding  factors.

Comparisons  of  the  ORs  for  larger  tumor  size  by  racial/ethnic  group  are  shown  in
Table  III-8.  In  the  initial  model,  ORs  adjusted  for  age  at  diagnosis  were  significantly 
high  for  Black,  Hispanic,  Filipino,  and  Korean  women  relative  to  White  patients.  These 
elevated  ORs  were  partially  explained  by  sociodemographic  factors  and  other  tumor 
characteristics,  but  remained  high  for  all  groups.  The  OR  for  Japanese  women  was 
significantly  reduced  and  remained  unchanged  after  each  set  of  study  variables  was 
added  to  the  model. 

Estimates  of  the  strength  of  the  associations  between  sociodemographic  variables,
tumor  grade,  and  hormone  receptor  status  (considered  as  confounders  in  this  study)  and 
the  three  study  outcomes,  while  not  the  focus  of  this  study,  are  shown  in  Appendices  III- 
2-4.  Among  the  sociodemographic  factors,  not  being  married  and  living  in  areas  where  a 
high  percentage  of  persons  do  not  have  a  high  school  diploma  are  consistently  associated 
with  a  more  advanced  stage  of  disease  at  diagnosis  and  larger  tumor  size.  Median  family 
income  is  negatively  associated  with  distant  stage  disease  and  larger  tumor  size,  but  is 
not  a  significant  predictor  of  regional  stage  disease. 

The  estimated  percentages  of  White,  Black,  Hispanic  and  Japanese  patients  with 
late  stage  disease  or  larger  tumor  size  by  education,  hormone  receptor  status,  and  tumor 
grade  are  shown  in  Tables  III-9-11.  The  percentages  are  adjusted  for  all  other  covariates 
in  the  full  regression  models  and  provide  an  alternative  to  the  odds  ratio  in  assessing  the 
influence  of  these  factors  on  the  disease  outcomes.  The  effects  of  education  (as  an 
indicator  of  socioeconomic  position)  and  tumor  biology  (hormone  receptor  status,  grade) 
are  greater  for  tumor  stage  than  for  tumor  size. 
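
The idea behind these adjusted percentages can be sketched as follows: the factor of interest is set to the same value for every patient, all other covariates are left as observed, and the fitted probabilities are averaged. The coefficients, variable names, and data rows are hypothetical, and this sketch omits the standard error calculation needed for confidence limits.

```python
import math


def predicted_probability(intercept, coefs, row):
    """Fitted probability of the outcome from a logistic model."""
    linear = intercept + sum(coefs[name] * row[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-linear))


def adjusted_proportion(intercept, coefs, rows, overrides):
    """Average fitted probability with the overridden covariates fixed for everyone."""
    total = 0.0
    for row in rows:
        # Keep each patient's other covariates as observed.
        counterfactual = {**row, **overrides}
        total += predicted_probability(intercept, coefs, counterfactual)
    return total / len(rows)


# Hypothetical model and study group, for illustration only.
intercept = -3.0
coefs = {"black": 0.3, "pct_no_diploma": 0.01}
rows = [
    {"black": 0, "pct_no_diploma": 10.0},
    {"black": 1, "pct_no_diploma": 40.0},
    {"black": 0, "pct_no_diploma": 25.0},
]
adj_black = adjusted_proportion(intercept, coefs, rows, {"black": 1})
adj_white = adjusted_proportion(intercept, coefs, rows, {"black": 0})
```

Because both averages are taken over the same covariate distribution, the difference between them reflects the modeled effect of the factor that was overridden rather than differences in the other covariates.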

Results  from  the  Hosmer-Lemeshow  goodness-of-fit  test  for  the  full  logistic
regression  model  corresponding  to  each  study  outcome  are  shown  in  Figures  III-4-6.
The  observed  and  expected  number  of  observations  appear  to  be  fairly  close  in  each 
model.  Perhaps  due  to  the  extremely  large  size  of  the  study  populations,  however,  the 
Hosmer-Lemeshow  test  statistics  indicate  that  the  models  for  distant  stage  disease  and 
regional  stage  disease  (vs.  localized  stage)  do  not  fit  the  observed  data  well.  The  model 
for  tumor  size  does  fit  the  observed  data  well,  based  upon  the  non-significant  Hosmer- 
Lemeshow  test  statistic  (p  =  0.12). 

The  regression  analysis  was  repeated  after  excluding  patients  that  were  missing 
information  on  tumor  grade.  The  pattern  of  associations  between  the  specific 
racial/ethnic  groups  and  cancer  outcomes  remained  unchanged.  Only  the  odds  ratios  for 
tumor  grade  and  hormone  receptor  status  in  the  distant  vs.  localized  tumor  stage  analysis 
changed  noticeably.  The  odds  ratio  for  high  tumor  grade  increased  from  2.0  to  3.6  (95% 
CI  =  3.4-3.9)  and  the  odds  ratio  for  negative  hormone  receptor  status  changed  from  2.1  to
1.5  (95%  CI  =  1.4-1.6).  The  Hosmer-Lemeshow  goodness-of-fit  statistic  for  the  distant
vs.  localized  tumor  stage  model  also  improved  following  the  exclusion  of  patients  with
unknown  tumor  grade  (χ²  =  7.55  with  8  DF,  p  =  0.48),  indicating  that  the  model  fit  the
data  well  (Figure  III-7).  When  patients  with  ductal  adenocarcinoma  and  those  with 
other  histological  types  combined  were  analyzed  in  separate  regression  models,  the 
patterns  of  association  between  the  study  factors  and  outcomes  were  similar. 

Individual  logistic  regression  models  were  produced  for  White,  Black,  and 
Hispanic  patients  (the  three  largest  groups)  in  order  to  examine  the  consistency  across 
racial/ethnic  group  of  associations  between  sociodemographic  factors,  tumor 
characteristics  and  each  of  the  study  outcomes.  The  odds  ratios  shown  in  Tables  III-12-
14  are  generally  comparable  in  magnitude  for  each  racial/ethnic  group.  One  exception  to
this  uniformity  of  effect  is  the  lack  of  an  association  between  marital  status  and  larger 
tumor  size  in  Black  women.  Another  exception  is  the  negative  association  between 
negative  hormone  receptor  status  and  larger  tumor  size  seen  for  White  women,  with  no 
significant  association  in  Black  or  Hispanic  women. 


CHAPTER  IV.  DISCUSSION 


A.  Study  Aims  Addressed 

Findings  from  this  population-based  study  of  106,607  female  breast  cancer 
patients  addressed  the  following  two  main  research  aims:  (1)  To  describe  the 
racial/ethnic  distribution  of  selected  demographic,  socioeconomic,  and  tumor 
characteristics  (stage  of  disease,  tumor  size,  tumor  grade,  estrogen/progesterone  receptor 
status)  that  influence  prognosis  for  cancer  of  the  female  breast;  and  (2)  To  assess  the 
importance  of  sociodemographic  factors  in  explaining  racial/ethnic  differences  in  the 
distribution  of  tumors  by  stage  and  size  at  the  time  of  diagnosis. 

Regarding  the  first  study  aim,  several  racial/ethnic  differences  in  tumor
characteristics  and  sociodemographic  factors  were  noted  in  this  study  population.  The 
tendency  for  White  and  Japanese  women  to  be  diagnosed  at  an  earlier  stage  than  other 
groups  has  been  documented  in  Hawaii  [LeMarchand  1984,  Meng  1997].  The  poorer 
stage  distribution  in  Black  [Ownby  1985,  Polednak  1986,  Bain  1986,  Bassett  1986, 
Stanford  1989,  Farley  1989,  Ragland  1991,  Wells  1992,  Chen  1994,  Eley  1994,  Simon 
1996,  Jones  1998],  Hispanic  [Samet  1987,  Bentley  1998],  and  American  Indian  women 
[Samet  1987]  has  also  been  noted  by  others.  This  is  the  only  population-based  study,  to 
my  knowledge,  that  has  characterized  tumor  grade  and  hormone  receptor  status  for  breast 
cancer  patients  in  specific  racial/ethnic  groups  other  than  Whites  or  Blacks. 

To  address  the  second  study  aim,  multiple  logistic  regression  models  were  used  to 
determine  whether  the  observed  racial/ethnic  differences  in  tumor  stage  and  tumor  size  at
the  time  of  diagnosis  persist  after  adjustment  for  sociodemographic  factors  and  biological
characteristics  of  the  tumors.  Elevated  odds  ratios  for  later  stage  or  larger  size  tumors 
among  Black  patients  and  Hispanic  patients  were  reduced  by  about  50%-60%  after 
adjustment  for  sociodemographic  factors.  Evidence  for  the  role  of  differential  tumor 
biology  in  accounting  for  the  racial/ethnic  differences  in  tumor  stage  and  size  was  not  as 
compelling.  When  information  on  tumor  grade  and  hormone  receptor  status  was  added
to  the  regression  models  already  containing  sociodemographic  variables,  odds  ratios  for 
the  Black  and  Hispanic  women  declined  further  for  distant  stage  disease,  but  did  not 
markedly  change  for  regional  stage  disease  or  tumor  size. 

Odds  ratios  for  Black  and  Hispanic  women  relative  to  White  women  remained 
slightly  elevated  after  adjusting  for  sociodemographic  and  tumor  biology  characteristics 
for  distant  and  regional  stage  disease.  In  the  analysis  of  tumor  size,  odds  ratios  for  Black, 
Hispanic,  Filipino,  and  Korean  women  remained  elevated  relative  to  White  women  after 
adjustment  for  sociodemographic  factors,  tumor  grade,  and  hormone  receptor  status. 
Japanese  women,  conversely,  had  consistently  lower  odds  than  White  women  for  each
study  outcome.  These  lower  odds  persisted  even  after  adjusting  for  other  study  factors. 

B.  Study  Strengths 

Strengths  of  this  study  include  the  large  patient  population  size  and  the  fact  that 
cases  were  identified  through  population-based  cancer  registries.  SEER  Program 
registries  cover  approximately  14%  of  the  entire  United  States  population  and  include 
geographic  regions  with  diverse  racial/ethnic  groups.  As  a  result,  this  study  was  able  to
assess  patterns  of  breast  cancer  severity  in  a  larger  number  of  racial/ethnic  groups  than
prior  investigations  that  were  limited  to  one  or  a  few  central  registries  or  a  limited 
number  of  hospitals  or  clinical  trial  groups  [Mohla  1982,  Ownby  1985,  Pegararo  1986, 
Polednak  1986,  Beverly  1987,  Stanford  1987,  Stanford  1989,  Farley  1989,  Mandelblatt
1991,  Wells  1992,  Chen  1994,  Ellege  1994,  Hulka  1994,  Gordon  1995,  Weiss  1995, 
Gapstur  1996,  Krieger  1997a,  Bentley  1998,  Elmore  1998,  Lannin  1998].  Further, 
patients  in  the  present  study  include  all  eligible  breast  cancer  cases  from  the  populations 
in  a  defined  set  of  geographic  areas  and  are  not  subject  to  the  influence  of  referral 
patterns  which  may  affect  hospital-based  or  clinical  trial-based  case  selection. 

Other  study  strengths  include  the  high  percentage  of  patient  diagnoses  that  were 
microscopically  confirmed  (99%)  and  our  ability  to  assess  the  importance  of  selected 
biological  characteristics  of  the  tumors  when  evaluating  racial/ethnic  differences  in 
tumor  stage  and  size  at  diagnosis.  Finally,  the  use  of  a  geographic  linkage  enabled  us  to 
assess  the  role  of  neighborhood-level  sociodemographic  factors  on  the  cancer  outcomes. 

C.  Study  Limitations 

Only  limited  risk  factor  information  is  available  from  cancer  registry  records  on 
the  study  subjects.  Individual  information  on  factors  such  as  body  mass,  alcohol  and 
tobacco  use,  reproductive  history,  medical  insurance  status,  usual  source  of  health  care, 
and  screening  behavior  would  have  been  helpful  in  this  analysis  of  tumor  characteristics 
at  the  time  of  diagnosis.  Although  the  utility  and  advantages  of  neighborhood-level 
measures  of  socioeconomic  position  are  well  documented  [Hakama  1982,  Massey  1990,
Kaplan  1996,  Krieger  1997b],  the  addition  of  individual  socioeconomic  information
would  have  allowed  a  multi-level  assessment  of  the  importance  of  these  factors  in  our 
study  population  [Krieger  1992,  Anderson  1997,  Krieger  1997b].  The  lack  of  individual 
socioeconomic  data  may  also  lead  to  residual  confounding  by  socioeconomic  position. 
This  residual  confounding  and/or  the  influence  of  other  important  unmeasured  factors 
could  explain  the  persistence  of  slightly  elevated  odds  ratios  for  some  racial/ethnic 
groups  in  this  study. 

In  spite  of  the  large  study  population,  the  relatively  small  number  of  American 
Indians  (n  =  136)  made  it  difficult  to  detect  statistically  meaningful  differences  for  this 
group.  Odds  ratios  associated  with  more  advanced  stage  of  disease  were  consistently 
elevated  for  American  Indian  women  and  were  comparable  to  the  excesses  seen  for 
Black  women,  but  due  to  the  small  population  size,  95%  confidence  limits  always 
included  one.  Only  one  of  the  SEER  registries  in  the  present  study,  New  Mexico,  is
currently  able  to  accurately  report  cancer  incidence  data  for  American  Indians.  The 
Alaska  Native  tumor  registry  has  recently  entered  the  SEER  Program  and  will  provide 
useful  data  for  future  studies.  Efforts  to  improve  reporting  are  underway  in  other 
registries,  but  current  misclassification  of  American  Indians  into  other  racial/ethnic 
groups  leads  to  significant  under-reporting  for  this  group  [Sugarman  1996].
Furthermore,  the  cancer  patterns  among  American  Indians  are  known  to  vary  by  region
and  tribe  [IHS  1997],  so  small  population  sizes  will  continue  to  hinder  epidemiologic
research  on  these  groups. 
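
To illustrate why a group of 136 cases can yield confidence limits that include one even when the point estimate is elevated, the following sketch computes a crude odds ratio with a Woolf (log-scale) 95% confidence interval from a 2x2 table. The counts are hypothetical and chosen only for illustration; they are not taken from the study data, and the estimates reported in this study came from adjusted regression models rather than from this unadjusted calculation.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-scale) confidence interval.

    a = exposed cases,   b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Hypothetical counts: a small exposed group versus a large reference group.
small = odds_ratio_ci(12, 124, 4500, 79500)
# Same proportions, but with an exposed group 50 times larger.
large = odds_ratio_ci(600, 6200, 4500, 79500)
```

Both calls give the same point estimate, but only the larger exposed group produces an interval that excludes one, because the standard error of the log odds ratio is dominated by the smallest cell counts.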

Another  study  limitation  was  the  large  percentage  of  patients  with  missing 
information  on  hormone  receptor  status  and  tumor  grade.  Given  the  widespread 
recognition of the utility of determining hormone receptor status for predicting response
to  hormonal  therapy  and  the  importance  of  tumor  grade  in  assessing  prognosis,  the  lack 
of  this  information  in  patient  records  is  troublesome.  Henson  reported  that  many 
physicians  consider  the  assignment  of  tumor  grade  to  be  too  subjective  to  be  of  much 
prognostic  use  and  this  may  explain  why  it  was  missing  for  25%  of  our  study  cases 
[Henson  1991].  Henson  noted  that  a  strong  relationship  between  histologic  grade  and 
patient  survival  persists  in  spite  of  interobserver  and  intraobserver  variability. 

D.  Interpretation 

In  spite  of  its  limitations,  this  study  represents  the  largest  analysis  of  breast  cancer 
among  women  in  diverse  racial/ethnic  groups  to  date  and  clearly  indicates  that 
sociodemographic  factors  may  play  an  important  role  in  accounting  for  observed 
racial/ethnic  differences  in  the  stage  of  disease  and  tumor  size  at  the  time  of  diagnosis. 
This  supports  findings  from  several  earlier  studies  [Ownby  1985,  Polednak  1986,  Farley 
1989,  Mandelblatt  1991,  Wells  1992,  Elmore  1995,  Weiss  1995,  Bentley  1998,  Lannin 
1998].

There  are  a  number  of  ways  that  sociodemographic  factors  may  be  influencing  the 
stage  and  size  of  breast  tumors  at  the  time  of  diagnosis.  The  association  in  this  study 
between  marital  status  and  tumor  stage  and  size  supports  the  results  from  a  study  of 
several  cancer  types  in  New  Mexico  [Goodwin  1987].  It  has  been  postulated  that 
married  persons  may  tend  to  have  better  health  habits  and  less  delay  in  seeking  medical 
care  after  the  occurrence  of  symptoms  than  unmarried  persons.  In  addition,  married 
persons tend to have higher socioeconomic status and greater social support [Goodwin
1987].  Several  investigators  have  emphasized  the  impact  of  socioeconomic  factors  on 
access  to  physician  care  or  screening  services  [Gregorio  1983,  Harper  1993,  Hoffman- 
Goetz  1998].  Mammography  use  has  been  found  to  be  positively  associated  with 
income,  education,  having  health  insurance  coverage,  having  a  usual  source  of  care,  and 
urban  residence  [Rakowski  1993,  Horton  1992,  Horton  1996,  Breen  1994,  Katz  1994, 
Anderson  1995,  Coughlin  1999,  Makuc  1999].  Therefore,  programs  to  promote 
screening  mammography  that  target  primary  care  physicians  and  women  with  low 
incomes  and  education  have  been  recommended  [Breen  1994,  Eley  1994].  Studies  to 
evaluate  the  efficacy  of  this  type  of  intervention  would  also  be  helpful. 

Several  surveys  indicate  that  the  use  of  mammography  in  the  United  States  has 
risen  over  time,  though  the  majority  of  breast  cancers  are  still  first  discovered  either  by 
the  patient  through  breast  self-exam  or  as  an  incidental  finding,  or  by  a  clinical  breast 
exam  [Norton  1992,  Benedict  1996,  McPherson  1997].  Self-reported  data  on  women 
aged  40  years  or  more,  who  were  interviewed  in  38  states  from  1989  to  1997  as  a  part  of 
the  Behavioral  Risk  Factor  Surveillance  System  (BRFSS),  showed  that  the  largest 
increases  in  mammography  usage  (defined  as  having  a  mammogram  within  the  previous 
two  years)  occurred  in  those  with  lower  education  and  lower  income  [Blackman  1999]. 
Personal  interview  data  from  the  National  Health  Interview  Survey,  spanning  1987  to 
1994,  indicated  that  recent  increases  in  mammography  screening  were  greatest  for  Black 
women  with  low  family  incomes  and  had  stabilized  for  low-income  White  women  and  all 
women  with  higher  family  incomes  [Makuc  1999].  National  estimates  from  the  Jacobs 
Institute  of  Women’s  Health  (JIWH)  Mammography  Attitudes  and  Usage  Study  of  1995 
also indicated that increases in mammography use occurred in recent years among
women  with  lower  family  incomes  [Horton  1996].  A  surprising  finding  from  this  survey 
was  the  slight  decline  since  1992  in  regular  mammography  screening  among  women  with 
college  degrees.  The  authors  suggested  that  this  may  have  been  due  to  increased  public 
concern  about  potential  health  risks  associated  with  radiation  exposure  from 
mammography.  Despite  the  general  increase  over  time  in  the  use  of  mammography  for 
early  detection,  however,  all  of  these  surveys  indicate  that  sociodemographic  differentials 
persist  with  women  in  lower  income  and  education  groups  having  lower  screening  rates. 

Current racial/ethnic patterns in the use of mammography have been reported from
several large surveys [Breen 1994, Burns 1996, Horton 1996, Blackman 1999]. Personal
interview  data  from  the  1990  National  Health  Interview  Survey  indicated  that  White, 
Black  and  Hispanic  women  aged  40  years  and  older  had  comparable  overall  rates  of 
screening mammograms within the previous year [Breen 1994]. A similar finding was
reported  in  the  JIWH  national  survey  for  Black  and  White  women,  although  the 
investigators  noted  that  the  percentage  of  women  in  compliance  with  current  American 
Cancer  Society  mammography  screening  guidelines  is  still  less  than  optimal  for  every 
racial/ethnic  group  [Horton  1996].  Racial/ethnic  patterns  of  mammography  use  were 
also  reported  from  data  collected  through  telephone  interviews  with  a  representative 
sample  of  the  civilian,  noninstitutionalized  adult  population  of  states  participating  in  the 
Behavioral  Risk  Factor  Surveillance  System  (BRFSS)  [Blackman  1999].  The  BRFSS 
study  results  indicated  comparable  rates  of  mammography  within  the  past  two  years 
among  White,  Black,  and  Asian  American  or  Pacific  Islander  groups,  but  a  lower  rate 
among  the  American  Indian  or  Alaska  Native  population. 

Tumor size greater than 1 cm was used in the present study as an indicator of
delayed  detection.  Tumors  smaller  than  1  cm  are  primarily  found  by  screening 
mammography,  whereas  larger  tumors  are  often  detected  by  other  methods  such  as 
symptoms,  clinical  breast  exam,  or  breast  self-exam  [Fletcher  1985,  Reintgen  1993, 
Helzlsouer 1995]. A recently published analysis of tumor size and stage in Asian
American  women  with  breast  cancer  reported  similar  results  to  those  from  the  present 
study;  namely,  that  women  in  Chinese,  Filipino  and  Korean  American  groups  were  more 
likely  than  White  women  to  be  diagnosed  with  a  tumor  size  greater  than  1  cm  [Hedeen 
1999].  Japanese  women  in  both  studies  had  a  slightly  lower  odds  than  White  women  for 
a  tumor  size  greater  than  1  cm.  The  similar  findings  are  not  surprising  since  Hedeen  et  al 
also  based  their  study  on  cases  identified  from  SEER  Program  registries,  though  their 
study  period  (diagnoses  between  1988  and  1994)  differed  somewhat  from  the  present 
study  and  included  cases  only  from  the  five  registries  with  the  greatest  number  of  Asian 
Americans. 

The  findings  from  the  current  study  and  the  study  by  Hedeen  suggest  that  there 
may  be  a  relative  delay  in  the  diagnosis  of  breast  cancer  among  women  in  these  ethnic 
groups.  Survey  data  on  health  behaviors  among  women  in  California  have  indicated  that 
Chinese,  Filipino,  Korean,  and  Vietnamese  women  are  less  likely  to  report  ever  having 
had  a  mammogram  than  are  women  in  the  general  population  [CDC  1992a,  CDC  1992b, 
CDC  1994,  Hiatt  1996,  CDC  1997,  Maxwell  1997].  This  provides  indirect  evidence  that 
lower  utilization  of  mammography  by  these  ethnic  groups  may  be  associated  with  the 
diagnosis  of  more  advanced  tumors. 

An  additional  finding  from  the  study  by  Hedeen  et  al  was  that  the  increased  odds 
ratio for larger tumor size was limited to Asian American women who were born in Asia.
The  investigators  suggest  that  a  woman’s  birthplace  and  level  of  acculturation  or 
assimilation  may  influence  her  beliefs  and  behaviors  with  respect  to  medical  care  in 
general  and  mammography  screening  utilization  in  particular.  Unfortunately,  place  of 
birth  information  was  unavailable  for  nearly  half  of  the  patients  in  the  current  study, 
precluding  an  examination  of  this  factor.  A  study  of  breast  cancer  screening  and 
screening-related  attitudes  among  Filipino-American  women  in  California  reported  lower 
screening  rates  in  this  group  than  in  Black  or  White  women  in  the  1994  California 
Behavioral  Risk  Factor  Study  [Maxwell  1997].  Factors  associated  with  lower  screening 
rates  in  Filipino  women  in  this  survey  included  lack  of  a  physician  recommendation  for  a 
mammogram,  concern  over  cost,  belief  that  a  mammogram  is  only  needed  in  the 
presence  of  symptoms,  perceived  inconvenience  or  difficulties  in  getting  to  the 
mammography  facility,  and  embarrassment  [Maxwell  1997]. 

Many  studies,  in  addition  to  the  present  one,  have  found  that  socioeconomic 
effects  alone  do  not  account  for  all  of  the  racial/ethnic  differences  in  tumor  stage  at 
diagnosis  [Vernon  1985,  Bain  1986,  Mandelblatt  1991,  Richardson  1992,  Wells  1992, 
Hunter  1993].  Even  in  situations  where  universal  access  to  medical  care  is  provided, 
racial/ethnic  disparities  in  breast  cancer  diagnosis  or  outcome  persist  [Track  1993,  Katz 
1994,  Wojcik  1998].  Cultural  factors  such  as  beliefs,  attitudes  and  knowledge  about 
cancer  have  been  shown  to  vary  by  race/ethnicity  and  have  been  found  to  influence 
cancer  screening  and  prevention  behaviors  [Michielutte  1982,  Jepson  1991,  Loehrer 
1991, Perez-Stable 1992, Harper 1993, Pachter 1994, Maxwell 1997, Lannin 1998,
Lobell 1998]. Results from a recent case-control study of breast cancer patients
diagnosed in a hospital primarily serving residents of two rural counties in eastern North
Carolina  indicated  that  psychosocial  and  cultural  variables  in  conjunction  with 
socioeconomic  factors  are  sufficient  to  explain  the  difference  in  stage  at  diagnosis 
between  Black  and  White  women  [Lannin  1998].  Because  30%  of  the  cancers  in  Whites 
and 11% in Blacks were discovered by routine screening mammography in the study by
Lannin  et  al,  it  would  seem  logical  to  conclude  that  cultural  beliefs  are  associated  with 
differential  use  of  screening  mammography.  Another  study  conducted  on  women  in  the 
same  community  at  the  same  time,  however,  found  that  a  woman’s  knowledge  and 
beliefs  had  little  influence  on  her  use  of  screening  mammography  [O’Malley  1997].  The 
most  important  factor  was  whether  mammography  was  recommended  to  the  patient  by  a 
physician.  Since  the  majority  of  early  and  late  stage  cancers  were  found  by  the  patient  in 
the  study  by  Lannin  et  al,  the  investigators  concluded  that  the  most  important  effect  of 
the  cultural  beliefs  is  that  they  lead  to  delayed  presentation  once  a  woman  has  developed 
a palpable breast abnormality [Lannin 1998]. Another study of breast cancer patients
identified  within  an  HMO  setting  in  North  Carolina  also  reported  that  patient  delay 
before  reporting  breast  cancer  symptoms  to  a  physician  was  an  important  factor  in 
explaining  tumor  stage  at  the  time  of  diagnosis  [Howard  1998]. 

Several  studies  have  reported  an  inverse  correlation  between  socioeconomic 
status  and  body  mass  [Allan  1993,  Millar  1993].  Increased  body  mass  or  obesity,  in  turn, 
has  been  linked  to  later  stage  breast  tumors  [Mohle-Boetani  1988,  Daniell  1988, 
Verreault  1989,  Reeves  1996,  Jones  1997]  or  larger  size  tumors  [Senie  1992, 
Bastarrachea 1994]. However, at least one study has not found an association between
obesity  and  tumor  stage  [Howson  1986].  The  mechanism  behind  more  advanced  breast 
cancer and obesity is unknown, but endocrinologic factors leading to increased levels of
endogenous  estrogen  have  been  suggested  [Morabia  1990,  Schapira  1991,  Bernstein 
1993, Maggino 1993, Hulka 1994, Kuller 1994, Hankinson 1995]. Obesity may also (or
alternatively)  play  a  diagnostic  role.  Some  studies  have  suggested  that  obesity  makes 
early detection more difficult [Austin 1979, Zumoff 1983, Mohle-Boetani 1988, Ingram
1989] or that physician approach to patients who are obese may differ [Weiss 1995].

Findings  from  the  present  study  indicate  that  differences  in  tumor  grade  and 
hormone  receptor  status  play  a  role  in  explaining  the  increased  diagnosis  of  distant  stage 
cancers  among  Black  women,  even  after  sociodemographic  factors  are  taken  into 
account.  This  result  supports  earlier  findings  by  Chen  et  al  who  compared  tumor 
characteristics  among  Black  and  White  breast  cancer  patients  diagnosed  in  1985  and 
1986  in  three  urban  SEER  registries  [Chen  1994].  Chen  et  al  noted  a  racial/ethnic 
difference  in  estrogen  receptor  status  after  adjustment  for  socioeconomic  position,  body 
mass  index,  use  of  alcohol  and  tobacco,  reproductive  experience,  health  care  access,  and 
usual  source  of  care.  Our  finding  that  Japanese  patients  tended  to  be  diagnosed  at  an 
earlier  stage  and  smaller  tumor  size  than  other  groups  after  adjustment  for  all  other  study 
factors  has  been  reported  by  others  [Ward-Hinds  1982,  LeMarchand  1984,  Stemmermann 
1985,  Higuchi  1993,  Hedeen  1999].  It  has  further  been  suggested  that  differences  in 
histopathologic  features  between  the  breast  cancers  of  Japanese  and  White  women  may 
indicate possible biological differences between the groups [Stemmermann 1991,
Higuchi 1993].


E.  Public  Health  Importance 

Data  base  linkage,  as  was  done  for  this  study  using  the  NCI  cancer  surveillance 
data  file  and  the  1990  U.S.  census  data  file,  can  provide  an  important  means  for 
enhancing  the  utility  of  routinely  collected  disease  surveillance  data.  As  additional  risk 
factor  information  is  added  to  a  disease  surveillance  system,  more  analytic  studies 
become  feasible.  These  studies  enable  the  surveillance  system  to  be  used,  not  only  for 
routine  monitoring  of  disease  trends  in  the  population,  but  also  for  improving  our 
understanding  of  potential  factors  underlying  and  explaining  the  trends.  Data  base 
linkage  also  has  the  advantage  of  generally  being  less  expensive  and  more  quickly 
accomplished  than  having  to  conduct  field  studies  requiring  the  collection  of  new  data  on 
individuals.  Data  base  linkage  does  not  replace  the  need  for  in-depth,  epidemiologic 
investigations of specific public health questions, but does play a useful, complementary
role  in  efforts  to  better  understand  the  patterns  of  disease  in  populations. 
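
As a minimal sketch of the kind of linkage described above, the following example attaches area-based census attributes to individual case records by matching on county and census tract. All field names, identifiers, and values are hypothetical and serve only to illustrate the join; they do not reflect the actual layout of the NCI surveillance file or the census data file.

```python
def link_cases_to_tracts(cases, tract_data):
    """Attach tract-level census attributes to each case record.

    Cases whose tract key is absent from the census file are kept,
    with the area-based fields set to None, so that unmatched
    records can be counted rather than silently dropped.
    """
    linked = []
    for case in cases:
        key = (case["county_fips"], case["tract"])
        area = tract_data.get(key)
        record = dict(case)  # copy, so the input records are not modified
        record["median_family_income"] = area["median_family_income"] if area else None
        record["pct_below_poverty"] = area["pct_below_poverty"] if area else None
        linked.append(record)
    return linked

# Hypothetical records; field names and values are illustrative only.
cases = [
    {"case_id": 1, "county_fips": "35001", "tract": "0001.00", "stage": "localized"},
    {"case_id": 2, "county_fips": "35001", "tract": "0002.00", "stage": "distant"},
    {"case_id": 3, "county_fips": "35049", "tract": "9999.00", "stage": "regional"},
]
tract_data = {
    ("35001", "0001.00"): {"median_family_income": 41000, "pct_below_poverty": 8.5},
    ("35001", "0002.00"): {"median_family_income": 19500, "pct_below_poverty": 31.2},
}

linked = link_cases_to_tracts(cases, tract_data)
unmatched = [r["case_id"] for r in linked if r["median_family_income"] is None]
```

Keeping unmatched cases in the output, rather than dropping them, makes it possible to report how many records could not be assigned area-based measures.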

In  this  study,  the  addition  of  sociodemographic  data  to  the  cancer  surveillance  file 
provided  a  unique  opportunity  to  evaluate  the  importance  of  these  factors  in  explaining 
racial/ethnic  differences  in  the  stage  and  size  of  breast  cancers  at  the  time  of  diagnosis. 
This  data  base  linkage  will  also  enable  the  conduct  of  additional  studies  to  evaluate  the 
importance  of  sociodemographic  factors  in  explaining  population  patterns  for  other  types 
of  cancer.  It  may  be  particularly  useful  in  the  study  of  specific  cancers  for  which  there 
have  been  recent  advances  in  the  methods  of  detection  or  treatment  (e.g.,  prostate- 
specific  antigen  screening  tests  for  cancer  of  the  prostate)  which  might,  in  turn,  be 
expected  to  differentially  impact  different  socioeconomic  groups. 

F. Future Directions

The  results  from  this  study  suggest  that  sociodemographic  factors  account  for  a 
significant  portion  of  the  observed  racial/ethnic  differences  in  the  stage  of  disease  and 
tumor  size  at  the  time  of  diagnosis,  but  that  differences  in  biological  characteristics  of 
breast tumors, at least among Black women, cannot be ruled out. It would be useful to
confirm  these  findings  in  additional  studies  that  include  central  histopathology  review 
and  that  include  patient-level  socioeconomic  data,  as  well  as  area-based  measures.  The 
identification  of  new,  valid  and  reliable  tumor  markers  would  allow  a  more  precise 
characterization  of  meaningful  racial/ethnic  or  sociodemographic  differences  in  breast 
tumor  types  for  future  studies.  Further  studies  are  also  needed  to  determine  whether 
differential exposure to carcinogens or genetic susceptibility is important in explaining
the  more  aggressive  forms  of  breast  cancer  in  specific  patient  subgroups. 

Additional  studies  should  investigate  the  roles  of  recent  immigration  and 
culturally-linked  health  behavior  patterns  among  breast  cancer  patients  in  explaining 
racial/ethnic  patterns  for  late  stage  at  diagnosis.  Since  a  socioeconomic  disparity  in 
mammography  screening  levels  has  been  documented  in  several  population  surveys, 
methods  for  increasing  compliance  with  recommended  guidelines  should  be  identified, 
implemented,  and  then  evaluated  for  their  efficacy.  Future  studies  could  also  focus  on 
sociodemographic  differences  in  the  quality  of  mammography,  whether  mammography  is 
received  at  regular  intervals,  and  whether  appropriate  follow-up  and  treatment  is  given  to 
identified  cases. 

TABLE I-1. Five-year cumulative relative survival rates by stage of disease at diagnosis
for female breast cancer cases, all ages, diagnosed 1989-94 [source: Ries 1998].

Stage at Diagnosis          Survival Rate    Confidence Limits*

Localized                       0.96           (0.96, 0.97)
Regional                        0.77           (0.76, 0.77)
Distant                         0.22           (0.20, 0.24)
Local+Regional+Distant          0.86           (0.85, 0.86)
Unstaged                        0.51           (0.48, 0.54)

* Confidence limits based on the survival rate +/- (2*standard error).
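
The footnote's rule (survival rate plus or minus twice the standard error) can be written out as a short sketch. The rate of 0.22 is the distant-stage value in the table; the standard error of 0.01 is an assumed value inferred from the width of the published limits and is not a figure reported in the source.

```python
def survival_confidence_limits(rate, standard_error):
    """Approximate limits as rate +/- 2 * standard error, clipped to [0, 1]."""
    half_width = 2 * standard_error
    lower = max(0.0, rate - half_width)
    upper = min(1.0, rate + half_width)
    return round(lower, 2), round(upper, 2)

# Assumed standard error of 0.01 (not reported in the source).
limits = survival_confidence_limits(0.22, 0.01)
```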

TABLE I-2. Summary of TNM-based stage groupings for invasive breast cancers.

Stage I:

The cancer is no larger than 2 centimeters and has not spread outside the breast.

Stage IIA is defined by either of the following:

The cancer is no larger than 2 centimeters but has spread to the lymph nodes under the
arm (the axillary lymph nodes).

The cancer is between 2 and 5 centimeters but has not spread to the axillary lymph
nodes.

Stage IIB is defined by either of the following:

The cancer is between 2 and 5 centimeters and has spread to the axillary lymph nodes.

The cancer is larger than 5 centimeters but has not spread to the axillary lymph nodes.

Stage IIIA is defined by either of the following:

The cancer is smaller than 5 centimeters and has spread to the lymph nodes under the
arm, and the lymph nodes are attached to each other or to other structures.

The cancer is larger than 5 centimeters and has spread to the lymph nodes under the
arm.

Stage IIIB is defined by either of the following:

The cancer has spread to tissues near the breast (skin or chest wall, including the ribs
and the muscles in the chest).

The cancer has spread to lymph nodes inside the chest wall along the breast bone.

Stage IV:

The cancer has spread to other organs of the body (most often the bones, lungs, liver, or
brain). Or, the tumor has spread locally to the skin and lymph nodes inside the neck,
near the collarbone.

TABLE I-3a. Overview of treatment options for invasive cancer of the female breast by
tumor  stage  [adapted  from:  PDQ  1999]. 


STAGE  I  BREAST  CANCER 
Treatment  may  be  one  of  the  following: 

1.  Breast-conserving surgery to remove only the cancer and some surrounding
breast  tissue  (lumpectomy)  or  to  remove  part  of  the  breast  (partial  or  segmental 
mastectomy);  both  are  followed  by  radiation  therapy.  Some  of  the  axillary  lymph 
nodes  are  also  removed.  This  treatment  provides  identical  long-term  cure  rates 
as  those  from  mastectomy.  A  doctor's  recommendation  on  which  procedure  to 
have  is  based  on  tumor  size  and  location  and  its  appearance  on  the 
mammogram. 

2.  Surgery  to  remove  the  whole  breast  (total  mastectomy)  or  the  whole  breast  and 
the  lining  over  the  chest  muscles  (modified  radical  mastectomy).  Some  of  the 
axillary  lymph  nodes  are  also  taken  out. 

Adjuvant  therapy  (given  in  addition  to  the  treatments  listed  above): 

1.  Chemotherapy.

2.  Hormone  therapy. 

3.  Clinical  trials  of  more  aggressive  adjuvant  chemotherapy  in  certain  patients. 

4.  Clinical  trials  of  no  adjuvant  therapy  for  patients  with  a  favorable  prognosis. 

5.  Clinical  trials  of  ovarian  ablation  or  suppression. 

TABLE I-3b. Overview of treatment options for invasive cancer of the female breast by
tumor  stage  [adapted  from:  PDQ  1999]. 


STAGE  II  BREAST  CANCER 

Treatment  may  be  one  of  the  following: 

1.  Breast-conserving surgery to remove only the cancer and some surrounding
breast  tissue  (lumpectomy)  or  to  remove  part  of  the  breast  (partial  or  segmental 
mastectomy);  both  are  followed  by  radiation  therapy.  Some  of  the  axillary  lymph 
nodes  are  also  removed.  This  treatment  provides  identical  long-term  cure  rates 
as  those  from  mastectomy.  A  doctor's  recommendation  on  which  procedure  to 
have  is  based  on  tumor  size  and  location  and  its  appearance  on  the 
mammogram. 

2.  Surgery  to  remove  the  whole  breast  (total  mastectomy)  or  the  whole  breast  and 
the  lining  over  the  chest  muscles  (modified  radical  mastectomy).  Some  of  the 
axillary  lymph  nodes  are  also  taken  out. 

Adjuvant  therapy  (given  in  addition  to  the  treatments  listed  above): 

1.  Chemotherapy with or without hormonal therapy.

2.  Hormone  therapy. 

3.  Clinical  trial  of  chemotherapy  before  surgery  (neoadjuvant  therapy). 

4.  Clinical trials of high-dose chemotherapy with bone marrow transplantation for
patients with cancer in more than three lymph nodes.

TABLE I-3c. Overview of treatment options for invasive cancer of the female breast by
tumor  stage  [adapted  from:  PDQ  1999]. 


STAGE  III  BREAST  CANCER 

Stage IIIA cancer:

Treatment may be one of the following surgeries:

1.  Surgery to remove the whole breast, the lining over the chest muscles, and many
of  the  lymph  nodes  (modified  radical  mastectomy)  or  the  whole  breast,  the  chest 
muscles,  and  all  of  the  lymph  nodes  (radical  mastectomy). 

2.  Radiation  therapy  given  after  surgery. 

3.  Chemotherapy  with  or  without  hormone  therapy  given  with  surgery  and  radiation 
therapy. 

4.  Clinical  trials  are  testing  new  chemotherapy  with  or  without  hormonal  drugs;  they 
are  also  testing  chemotherapy  before  surgery  (neoadjuvant  therapy). 

5.  Clinical  trials  of  high-dose  chemotherapy  with  bone  marrow  or  peripheral  stem 
cell  transplantation. 


Stage IIIB cancer:

The  patient  will  probably  have  a  biopsy  and  then  be  given  one  or  more  of  the  following: 

1.  Surgery (radical or modified radical mastectomy) and/or radiation therapy to the
breast  and  the  lymph  nodes. 

2.  Chemotherapy  with  or  without  hormones  to  shrink  the  tumor,  followed  by  surgery 
and/or  radiation  therapy. 

3.  Hormonal  therapy  followed  by  additional  therapy. 

4.  Clinical  trials  are  testing  new  chemotherapy  drugs  and  biological  therapy,  new 
drug  combinations,  and  new  ways  of  giving  chemotherapy. 

5.  Clinical  trials  of  high-dose  chemotherapy  with  bone  marrow  or  peripheral  stem 
cell  transplantation. 

TABLE I-3d. Overview of treatment options for invasive cancer of the female breast by
tumor  stage  [adapted  from:  PDQ  1999]. 


STAGE  IV  BREAST  CANCER 

The  patient  will  probably  have  a  biopsy  and  then  be  given  one  or  more  of  the  following: 

1.  Radiation therapy or, in some cases, a mastectomy to reduce the symptoms.

2.  Hormonal  therapy  with  or  without  surgery  to  remove  the  ovaries. 

3.  Combination  chemotherapy. 

4.  Clinical  trials  are  testing  new  chemotherapy  and  hormonal  drugs  and  new 
combinations  of  drugs  and  biological  therapy. 

5.  Clinical  trials  of  high-dose  chemotherapy  with  bone  marrow  or  peripheral  stem 
cell  transplantation. 

TABLE II-1. Counties and census tracts (or block numbering areas) included in SEER
areas.

                             No. of      No. of Census    Avg. Population Size
SEER Area                    Counties    Tracts or BNA    per Tract or BNA

Los Angeles                       1          1,652              5,365
Detroit                           3          1,088              3,596
San Francisco & Oakland           5            843              4,373
Connecticut                       8            834              3,941
Iowa                             99            783              3,546
Seattle                           3            754              4,465
San Jose & Monterey               4            546              3,882
Utah                             29            400              4,307
New Mexico                       33            390              3,885
Atlanta                           5            367              5,933
Hawaii                            5            265              4,182

All Areas                       195          7,922

SEER = Surveillance, Epidemiology, and End Results program.
BNA = block numbering area.

Data source: Bureau of the Census, 1990 (STF-3A data file).

TABLE II-2a. Breast cancer study size requirements for detecting specified odds ratios
(OR), where Case = distant stage cancer, Unexposed = white, Exposed = other specific
racial/ethnic group, α = 0.95, 1-β = 0.80, P(D|Unexposed) = 0.054.

           Study Size Needed            Available Study Size
                                        Unexposed : Exposed

 OR        White      Black             = White : Black

 1.5        8,093        861            = 84,446 : 9,031
 1.2       44,218      4,704            = 9.4 : 1
 1.1      168,354     17,910
 0.9      155,222     16,513
 0.8       37,628      4,003
 0.6        8,789        935

 OR        White      Hispanic          = White : Hispanic

 1.5       10,008        841            = 84,446 : 7,074
 1.2       54,776      4,603            = 11.9 : 1
 1.1      208,690     17,537
 0.9      192,649     16,189
 0.8       46,731      3,927
 0.6       10,924        918

 OR        White      Japanese          = White : Japanese

 1.5       35,403        785            = 84,446 : 1,871
 1.4       53,128      1,178            = 45.1 : 1
 1.2      194,967      4,323
 0.8      167,637      3,717
 0.7       72,250      1,602
 0.6       39,327        872

 OR        White      Filipino          = White : Filipino

 1.5       41,759        782            = 84,446 : 1,581
 1.4       62,692      1,174            = 53.4 : 1
 1.2      230,047      4,308
 0.8      197,847      3,705
 0.7       85,280      1,597
 0.6       46,405        869

 OR        White      Chinese           = White : Chinese

 1.5       47,502        780            = 84,446 : 1,387
 1.4       71,314      1,171            = 60.9 : 1
 1.3      117,537      1,930
 0.8      225,147      3,697
 0.7       97,075      1,594
 0.6       52,800        867

TABLE II-2b. Breast cancer study size requirements for detecting specified odds ratios
(OR), where Case = distant stage cancer, Unexposed = white, Exposed = other specific
racial/ethnic group, α = 0.95, 1-β = 0.80, P(D|Unexposed) = 0.054.

           Study Size Needed                    Available Study Size
                                                Unexposed : Exposed

 OR        White      Hawaiian                  = White : Hawaiian

 2.0       38,180        230                    = 84,446 : 508
 1.7       70,218        423                    = 166 : 1
 1.5      127,820        770
 0.5       87,980        530
 0.4       58,598        353
 0.3       41,168        248

 OR        White      Korean                    = White : Korean

 2.0       64,349        229                    = 84,446 : 301
 1.9       76,713        273                    = 281 : 1
 1.5      215,808        768
 0.5      148,649        529
 0.4       99,193        353
 0.3       69,407        247

 OR        White      Vietnamese                = White : Vietnamese

 2.0       70,990        229                    = 84,446 : 272
 1.9       84,630        273                    = 310 : 1
 1.5      238,080        768
 0.5      163,990        529
 0.4      109,430        353
 0.3       76,570        247

 OR        White      American Indian (NM)      = White : American Indian (NM)

 2.4       81,351        131                    = 84,446 : 136
 2.3       91,908        148                    = 621 : 1
 2.0      141,588        228
 0.3      153,387        247
 0.2      110,538        178
 0.1       79,488        128
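
The source does not state which formula or software produced the study sizes in Tables II-2a and II-2b. As a hedged illustration of how such requirements can be derived, the sketch below uses a standard pooled-variance normal approximation for comparing two proportions with unequal group sizes, without a continuity correction. It assumes a two-sided significance level of 0.05 and power of 0.80, and it is not expected to reproduce the tabulated values, which may reflect a different or more conservative method. The function and parameter names are illustrative.

```python
import math
from statistics import NormalDist

def exposed_proportion(p_unexposed, odds_ratio):
    """Outcome probability among the exposed implied by an odds ratio."""
    odds = odds_ratio * p_unexposed / (1 - p_unexposed)
    return odds / (1 + odds)

def study_size(p_unexposed, odds_ratio, ratio, alpha=0.05, power=0.80):
    """Unexposed and exposed group sizes for comparing two proportions.

    ratio = number of unexposed subjects per exposed subject.
    Uses the pooled-variance normal approximation without a
    continuity correction (an assumed method, not the source's).
    """
    p1 = exposed_proportion(p_unexposed, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + ratio * p_unexposed) / (1 + ratio)
    n_exposed = ((z_alpha + z_beta) ** 2 * p_bar * (1 - p_bar) * (ratio + 1)
                 / (ratio * (p1 - p_unexposed) ** 2))
    n_exposed = math.ceil(n_exposed)
    return math.ceil(ratio * n_exposed), n_exposed

# P(D|Unexposed) = 0.054 and the 9.4 : 1 White-to-Black ratio come from Table II-2a.
unexposed_15, exposed_15 = study_size(0.054, 1.5, 9.4)
unexposed_11, exposed_11 = study_size(0.054, 1.1, 9.4)
```

The sketch shows the pattern visible in the tables: the required study size grows sharply as the odds ratio approaches one, and for a fixed odds ratio the required number of unexposed subjects scales with the unexposed-to-exposed ratio.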

TABLE II-3. Individual and area-based study variables.


INDIVIDUAL-LEVEL  VARIABLES 

1.  Cancer type

2.  Stage  of  disease  at  the  time  of  cancer  diagnosis 

3.  Tumor  size  at  the  time  of  cancer  diagnosis 

4.  Tumor  grade 

5.  Estrogen  receptor  status 

6.  Progesterone  receptor  status 

7.  Age  in  years  at  the  time  of  cancer  diagnosis 

8.  Race/ethnicity 

9.  Sex 

10.  Marital status at the time of cancer diagnosis

11.  SEER registry

12.  Census tract and county of residence at the time of cancer diagnosis


CENSUS  TRACT-LEVEL  VARIABLES 

1.  % Employed persons, age 16+, in “working-class” occupations (listed below)

administrative support and clerical; sales; private
household  and  other  service  occupations  (excl.  protective 
services);  precision  production,  craft  and  repair;  machine 
operators,  assemblers  and  inspectors;  transportation  and 
material  moving;  handlers,  equipment  cleaners,  helpers 
and  laborers 

2.  Median  family  income 

3.  Median  household  income 

4.  %  Persons  with  an  income  below  the  poverty  line 

5.  %  Families  with  an  income  below  the  poverty  level,  a  female  householder  (no 
husband  present),  and  related  children  <18  years  old. 

6.  %  Households  owning  their  home 

7.  %  Households  owning  no  car 

8.  %  Persons,  age  25+,  that  have  not  completed  high  school 

9.  %  Persons  living  in  an  urban  area 

10.  %  Unemployed  among  persons,  age  16+,  in  labor  force 

11.  %  Persons  bom  in  foreign  country 
TABLE II-4. Distribution of staged and unstaged cancers of the female breast in study
population by race/ethnicity.

Race/Ethnicity            Stagedᵃ           Unstagedᵇ       Total
White                      82,031 (97%)     2,415 (3%)       84,446 (100%)
Black                       8,567 (95%)       464 (5%)        9,031 (100%)
Hispanicᶜ                   6,844 (97%)       230 (3%)        7,074 (100%)
Japanese                    1,844 (99%)        27 (1%)        1,871 (100%)
Filipino                    1,543 (98%)        38 (2%)        1,581 (100%)
Chinese                     1,353 (98%)        34 (2%)        1,387 (100%)
Hawaiian                      499 (98%)         9 (2%)          508 (100%)
Korean                        287 (95%)        14 (5%)          301 (100%)
Vietnamese                    268 (99%)         4 (1%)          272 (100%)
American Indian (NM)ᵈ         135 (99%)         1 (1%)          136 (100%)
Total                     103,371 (97%)     3,236 (3%)      106,607 (100%)

ᵃ Staged = Extent of disease information was sufficient to assign one of the following stages:
localized, regional, or distant disease.

ᵇ Unstaged = Extent of disease information was insufficient to assign a stage.

ᶜ Since persons of Hispanic ethnicity may be of any race, Hispanic cases were removed from all
other racial/ethnic categories and combined to form this group.

ᵈ American Indians in New Mexico only.


TABLE II-5. Distribution of staged and unstaged cancers of the female breast in study
population by age at diagnosis.

Age at Diagnosis     Stagedᵃ           Unstagedᵇ        Total
<40                    6,468 (97%)       223 (3%)         6,691 (100%)
40-49                 18,199 (98%)       439 (2%)        18,638 (100%)
50-59                 20,523 (98%)       471 (2%)        20,994 (100%)
60-69                 23,391 (98%)       497 (2%)        23,888 (100%)
70-79                 22,867 (97%)       635 (3%)        23,502 (100%)
80-89                 10,481 (94%)       702 (6%)        11,183 (100%)
90+                    1,442 (84%)       269 (16%)        1,711 (100%)
All Ages             103,371 (97%)     3,236 (3%)       106,607 (100%)

ᵃ Staged = Extent of disease information was sufficient to assign one of the following stages:
localized, regional, or distant disease.

ᵇ Unstaged = Extent of disease information was insufficient to assign a stage.
TABLE II-6. Distribution of staged cancers of the female breast in study population by
registry and availability of census tract-level socioeconomic information.

                            Socioeconomic Information Available?
Registry                    Yes               No              Total
San Francisco & Oakland     11,902 (96%)      535 (4%)         12,437 (100%)
Connecticut                 11,705 (99%)      107 (1%)         11,812 (100%)
Detroit                     12,652 (99%)      110 (1%)         12,762 (100%)
Hawaii                       2,594 (86%)      425 (14%)         3,019 (100%)
Iowa                         9,715 (99%)       84 (1%)          9,799 (100%)
New Mexico                   3,875 (93%)      299 (7%)          4,174 (100%)
Seattle                     10,420 (91%)      975 (9%)         11,395 (100%)
Utah                         3,813 (>99%)      16 (<1%)         3,829 (100%)
Atlanta                      5,675 (93%)      418 (7%)          6,093 (100%)
San Jose & Monterey          5,433 (93%)      402 (7%)          5,835 (100%)
Los Angeles                 21,714 (98%)      502 (2%)         22,216 (100%)
Total                       99,498 (96%)    3,873 (4%)        103,371 (100%)
TABLE II-7. Distribution of staged cancers of the female breast in study population by
race/ethnicity and availability of census tract-level socioeconomic information.

                            Socioeconomic Information Available?
Race/Ethnicity              Yes               No              Total
White                       79,045 (96%)    2,986 (4%)         82,031 (100%)
Black                        8,329 (97%)      238 (3%)          8,567 (100%)
Hispanicᵃ                    6,579 (96%)      265 (4%)          6,844 (100%)
Japanese                     1,733 (94%)      111 (6%)          1,844 (100%)
Filipino                     1,442 (93%)      101 (7%)          1,543 (100%)
Chinese                      1,309 (97%)       44 (3%)          1,353 (100%)
Hawaiian                       400 (80%)       99 (20%)           499 (100%)
Korean                         277 (97%)       10 (3%)            287 (100%)
Vietnamese                     257 (96%)       11 (4%)            268 (100%)
American Indian (NM)ᵇ          127 (94%)        8 (6%)            135 (100%)
Total                       99,498 (96%)    3,873 (4%)        103,371 (100%)

ᵃ Since persons of Hispanic ethnicity may be of any race, Hispanic cases were removed from all
other racial/ethnic categories and combined to form this group.

ᵇ American Indians in New Mexico only.
TABLE II-8. Distribution of female breast cancer cases in study population by
race/ethnicity and availability of tumor size information.

                          Tumor Size Available?
Race/Ethnicity            Yes                No                 Paget'sᵃ        Total
White                     76,843 (91.0%)     7,512 (8.9%)        91 (0.1%)       84,446 (100%)
Black                      7,963 (88.2%)     1,062 (11.7%)        6 (0.1%)        9,031 (100%)
Hispanicᵇ                  6,463 (91.4%)       605 (8.5%)         6 (0.1%)        7,074 (100%)
Japanese                   1,731 (92.5%)       137 (7.3%)         3 (0.2%)        1,871 (100%)
Filipino                   1,467 (92.8%)       112 (7.1%)         2 (0.1%)        1,581 (100%)
Chinese                    1,267 (91.4%)       118 (8.5%)         2 (0.1%)        1,387 (100%)
Hawaiian                     470 (92.5%)        38 (7.5%)         0 (0.0%)          508 (100%)
Korean                       280 (93.0%)        21 (7.0%)         0 (0.0%)          301 (100%)
Vietnamese                   259 (95.2%)        13 (4.8%)         0 (0.0%)          272 (100%)
American Indian (NM)ᶜ        128 (94.1%)         8 (5.9%)         0 (0.0%)          136 (100%)
Total                     96,871 (90.9%)     9,626 (9.0%)       110 (0.1%)      106,607 (100%)

ᵃ Paget's disease without an underlying tumor.

ᵇ Since persons of Hispanic ethnicity may be of any race, Hispanic cases were removed from all
other racial/ethnic categories and combined to form this group.

ᶜ American Indians in New Mexico only.


TABLE II-9. Distribution of female breast cancer cases in study population by age at
diagnosis and availability of tumor size information.

                     Tumor Size Available?
Age at Diagnosis     Yes                No                 Paget'sᵃ        Total
<40                   6,073 (90.8%)       612 (9.1%)         6 (0.1%)        6,691 (100%)
40-49                17,009 (91.3%)     1,619 (8.7%)        10 (<0.1%)      18,638 (100%)
50-59                19,172 (91.3%)     1,812 (8.6%)        10 (<0.1%)      20,994 (100%)
60-69                21,785 (91.2%)     2,066 (8.6%)        37 (0.2%)       23,888 (100%)
70-79                21,478 (91.4%)     1,999 (8.5%)        25 (0.1%)       23,502 (100%)
80-89                 9,961 (89.1%)     1,201 (10.7%)       21 (0.2%)       11,183 (100%)
90+                   1,393 (81.4%)       317 (18.5%)        1 (0.1%)        1,711 (100%)
All Ages             96,871 (90.8%)     9,626 (9.0%)       110 (0.1%)      106,607 (100%)

ᵃ Paget's disease without an underlying tumor.
TABLE II-10. Distribution of female breast cancer cases in study population with tumor
size information by registry and availability of census tract-level socioeconomic
information.

                            Socioeconomic Information Available?
Registry                    Yes               No              Total
San Francisco & Oakland     11,294 (96%)      491 (4%)         11,785 (100%)
Connecticut                 10,343 (99%)       91 (1%)         10,434 (100%)
Detroit                     11,539 (99%)      102 (1%)         11,641 (100%)
Hawaii                       2,440 (86%)      390 (14%)         2,830 (100%)
Iowa                         9,280 (99%)       80 (1%)          9,360 (100%)
New Mexico                   3,654 (93%)      281 (7%)          3,935 (100%)
Seattle                     10,127 (91%)      944 (9%)         11,071 (100%)
Utah                         3,597 (>99%)      14 (<1%)         3,611 (100%)
Atlanta                      5,268 (93%)      387 (7%)          5,655 (100%)
San Jose & Monterey          4,974 (93%)      358 (7%)          5,332 (100%)
Los Angeles                 20,743 (98%)      474 (2%)         21,217 (100%)
Total                       93,259 (96%)    3,612 (4%)         96,871 (100%)
TABLE II-11. Distribution of female breast cancer cases in study population with tumor
size information by race/ethnicity and availability of census tract-level socioeconomic
information.

                            Socioeconomic Information Available?
Race/Ethnicity              Yes               No              Total
White                       74,053 (96%)    2,790 (4%)         76,843 (100%)
Black                        7,744 (97%)      219 (3%)          7,963 (100%)
Hispanicᵃ                    6,215 (96%)      248 (4%)          6,463 (100%)
Japanese                     1,630 (94%)      101 (6%)          1,731 (100%)
Filipino                     1,369 (93%)       98 (7%)          1,467 (100%)
Chinese                      1,230 (97%)       37 (3%)          1,267 (100%)
Hawaiian                       379 (81%)       91 (19%)           470 (100%)
Korean                         270 (96%)       10 (4%)            280 (100%)
Vietnamese                     248 (96%)       11 (4%)            259 (100%)
American Indian (NM)ᵇ          121 (95%)        7 (5%)            128 (100%)
Total                       93,259 (96%)    3,612 (4%)         96,871 (100%)

ᵃ Since persons of Hispanic ethnicity may be of any race, Hispanic cases were removed from all
other racial/ethnic categories and combined to form this group.

ᵇ American Indians in New Mexico only.


TABLE III-1a. Distribution of selected characteristics among 106,607 female breast cancer patients, diagnosed 1992-1996.
TABLE III-1b, cont. Distribution of selected characteristics among 106,607 female breast cancer patients, diagnosed 1992-1996.
TABLE III-2. Percentage distribution of 106,607 invasive breast cancers by histological type and racial/ethnic group.
TABLE III-3. Distribution of selected characteristics among invasive breast cancer
patients by tumor stage.

                              Distant             Localized
                              (Total = 5,698)     (Total = 65,018)
                              n                   n                    ORᵃ     95% CI
Tumor grade
  1-2, unknown                3,198               47,952               1.0
  3-4                         2,500               17,066               2.2     2.1, 2.3
Hormone receptor status
  ER or PR positive           2,344               40,703               1.0
  All other                   3,354               24,315               2.4     2.3, 2.5
Marital status
  Married or unknown          2,683               37,267               1.0
  Not married                 3,015               27,751               1.5     1.4, 1.6
Urban census tract
  Not urban                   1,219               14,848               1.0
  Urban                       4,479               50,170               1.1     1.0, 1.2

ᵃ OR = crude odds ratio.
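The crude odds ratios above can be reproduced from the cell counts. The sketch below is illustrative only: the function name is hypothetical, and the log-based (Woolf) confidence interval is an assumption, since the table does not say which interval method was used. For the tumor grade row it gives values that round to the tabulated 2.2 (2.1, 2.3).

```python
import math


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Crude odds ratio with a log-based (Woolf) confidence interval.

    a = exposed cases, b = unexposed cases,
    c = exposed referents, d = unexposed referents.
    The Woolf interval is an assumption; the source table does not
    name the method behind its 95% CI.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Tumor grade 3-4 vs. 1-2/unknown; distant (cases) vs. localized (referents).
or_, lo, hi = crude_odds_ratio(2500, 3198, 17066, 47952)
print(f"{or_:.1f} ({lo:.1f}, {hi:.1f})")
```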


TABLE III-4. Distribution of selected characteristics among invasive breast cancer
patients by tumor stage.

                              Regional            Localized
                              (Total = 28,782)    (Total = 65,018)
                              n                   n                    ORᵃ     95% CI
Tumor grade
  1-2, unknown                16,249              47,952               1.0
  3-4                         12,533              17,066               2.2     2.1, 2.2
Hormone receptor status
  ER or PR positive           18,040              40,703               1.0
  All other                   10,742              24,315               1.0     1.0, 1.0
Marital status
  Married or unknown          16,617              37,267               1.0
  Not married                 12,165              27,751               1.0     1.0, 1.0
Urban census tract
  Not urban                    6,393              14,848               1.0
  Urban                       22,389              50,170               1.0     1.0, 1.1

ᵃ OR = crude odds ratio.
TABLE III-5. Distribution of selected characteristics among invasive breast cancer
patients by tumor size.

                              1.0 cm+             <1.0 cm
                              (Total = 76,626)    (Total = 16,633)
                              n                   n                    ORᵃ     95% CI
Tumor grade
  1-2, unknown                48,711              14,008               1.0
  3-4                         27,915               2,625               3.1     2.9, 3.2
Hormone receptor status
  ER or PR positive           48,248              10,302               1.0
  All other                   28,378               6,331               1.0     0.9, 1.0
Marital status
  Married or unknown          42,817              10,057               1.0
  Not married                 33,809               6,576               1.2     1.2, 1.3
Urban census tract
  Not urban                   17,146               3,847               1.0
  Urban                       59,480              12,786               1.0     1.0, 1.1

ᵃ OR = crude odds ratio.
TABLE III-6. Effects of selected risk factors on the relative odds of a distant stage
breast cancer diagnosis among specific racial/ethnic groups compared with whites.

                                        Distant vs. Localized
Variables in model                      ORᵃ     (95% CI)

Age
  White                                 1.0
  Black                                 2.1     (1.9, 2.2)
  Hispanic                              1.5     (1.4, 1.7)
  Japanese                              0.7     (0.5, 0.9)
  Filipino                              1.0     (0.7, 1.2)
  Chinese                               1.1     (0.8, 1.3)
  Hawaiian                              1.4     (0.9, 2.0)
  Korean                                0.7     (0.3, 1.2)
  Vietnamese                            0.8     (0.4, 1.5)
  Am. Indian                            2.0     (1.0, 3.7)

Age, registry, urban area
  White                                 1.0
  Black                                 2.0     (1.9, 2.2)
  Hispanic                              1.5     (1.3, 1.6)
  Japanese                              0.7     (0.6, 1.0)
  Filipino                              1.0     (0.8, 1.3)
  Chinese                               1.2     (0.9, 1.5)
  Hawaiian                              1.5     (1.0, 2.3)
  Korean                                0.7     (0.4, 1.3)
  Vietnamese                            0.9     (0.4, 1.6)
  Am. Indian                            1.8     (0.9, 3.3)

Age, registry, urban area, SDFᵇ
  White                                 1.0
  Black                                 1.5     (1.3, 1.6)
  Hispanic                              1.2     (1.0, 1.3)
  Japanese                              0.7     (0.5, 0.9)
  Filipino                              0.9     (0.7, 1.1)
  Chinese                               1.1     (0.8, 1.4)
  Hawaiian                              1.3     (0.8, 2.0)
  Korean                                0.6     (0.3, 1.2)
  Vietnamese                            0.7     (0.4, 1.3)
  Am. Indian                            1.3     (0.7, 2.5)

Age, registry, urban area, SDFᵇ,
tumor grade, ER/PR status
  White                                 1.0
  Black                                 1.3     (1.2, 1.5)
  Hispanic                              1.1     (1.0, 1.2)
  Japanese                              0.7     (0.6, 1.0)
  Filipino                              0.9     (0.7, 1.1)
  Chinese                               1.0     (0.8, 1.3)
  Hawaiian                              1.4     (0.9, 2.1)
  Korean                                0.6     (0.3, 1.0)
  Vietnamese                            0.7     (0.3, 1.2)
  Am. Indian                            1.4

Footnotes for Table III-6.

ᵃ OR = odds ratio is adjusted for other explanatory variables in the regression model.

ᵇ SDF = sociodemographic factors include % not married at time of diagnosis; % without high
school diploma; % working class; % families headed by women with no husband at home, with
one or more children, and who are living below the poverty level; median family income; % own
their home; % having no car.
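The abstract reports that elevated odds ratios for Black and Hispanic patients fell by 50% to 60% once sociodemographic factors were added to a model containing age and geographic area. A minimal sketch of that arithmetic follows, assuming the common convention that the reduction is measured on the excess odds (OR minus 1); the function name is hypothetical and the thesis may have computed the figure differently.

```python
def excess_odds_reduction(or_before: float, or_after: float) -> float:
    """Percent of the excess odds (OR - 1) removed by further adjustment.

    Assumes the 'percent of excess explained' convention; this is an
    illustration, not necessarily the thesis's own calculation.
    """
    if or_before <= 1.0:
        raise ValueError("no excess odds to explain when OR <= 1")
    return 100.0 * (or_before - or_after) / (or_before - 1.0)


# Distant vs. localized, Table III-6:
# model with age, registry, urban area -> same model plus SDF.
black = excess_odds_reduction(2.0, 1.5)
hispanic = excess_odds_reduction(1.5, 1.2)
print(round(black), round(hispanic))
```

Under this convention the Black and Hispanic odds ratios in Table III-6 give reductions of about 50% and 60%, in line with the range quoted in the abstract.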
TABLE III-7. Effects of selected risk factors on the relative odds of a regional stage
breast cancer diagnosis among specific racial/ethnic groups compared with whites.

                                        Regional vs. Localized
Variables in model                      ORᵃ     (95% CI)

Age
  White                                 1.0
  Black                                 1.4     (1.3, 1.4)
  Hispanic                              1.3     (1.2, 1.3)
  Japanese                              0.7     (0.7, 0.8)
  Filipino                              1.0     (0.9, 1.2)
  Chinese                               1.0     (0.9, 1.1)
  Hawaiian                              1.0     (0.8, 1.2)
  Korean                                0.9     (0.7, 1.2)
  Vietnamese                            1.3     (1.0, 1.6)
  Am. Indian                            1.5     (1.0, 2.2)

Age, registry, urban area
  White                                 1.0
  Black                                 1.3     (1.3, 1.4)
  Hispanic                              1.2     (1.2, 1.3)
  Japanese                              0.8     (0.7, 0.9)
  Filipino                              1.1     (0.9, 1.2)
  Chinese                               1.0     (0.9, 1.2)
  Hawaiian                              1.1     (0.9, 1.4)
  Korean                                0.9     (0.7, 1.2)
  Vietnamese                            1.2     (1.0, 1.6)
  Am. Indian                            1.4     (1.0, 2.0)

Age, registry, urban area, SDFᵇ
  White                                 1.0
  Black                                 1.2     (1.2, 1.3)
  Hispanic                              1.1     (1.1, 1.2)
  Japanese                              0.8     (0.7, 0.9)
  Filipino                              1.0     (0.9, 1.1)
  Chinese                               1.0     (0.9, 1.2)
  Hawaiian                              1.1     (0.8, 1.4)
  Korean                                0.9     (0.7, 1.2)
  Vietnamese                            1.2     (0.9, 1.5)
  Am. Indian                            1.3     (0.9, 1.8)

Age, registry, urban area, SDFᵇ,
tumor grade, ER/PR status
  White                                 1.0
  Black                                 1.2     (1.1, 1.2)
  Hispanic                              1.1     (1.1, 1.2)
  Japanese                              0.8     (0.7, 0.9)
  Filipino                              1.0     (0.9, 1.1)
  Chinese                               1.0     (0.9, 1.1)
  Hawaiian                              1.0     (0.8, 1.3)
  Korean                                0.9     (0.7, 1.1)
  Vietnamese                            1.1     (0.9, 1.5)
  Am. Indian                            1.3     (0.9, 1.9)

Footnotes for Table III-7.

ᵃ OR = odds ratio is adjusted for other explanatory variables in the regression model.

ᵇ SDF = sociodemographic factors include % not married at time of diagnosis; % without high
school diploma; % working class; % families headed by women with no husband at home, with
one or more children, and who are living below the poverty level; median family income; % own
their home; % having no car.
TABLE III-8. Effects of selected risk factors on the relative odds of a breast cancer
diagnosis with tumor diameter greater than or equal to 1 cm among specific racial/ethnic
groups compared with whites.

                                        Tumor size ≥1 cm vs. <1 cm
Variables in model                      ORᵃ     (95% CI)

Age
  White                                 1.0
  Black                                 1.8     (1.7, 2.0)
  Hispanic                              1.6     (1.4, 1.7)
  Japanese                              0.8     (0.7, 0.9)
  Filipino                              1.5     (1.3, 1.8)
  Chinese                               1.2     (1.0, 1.4)
  Hawaiian                              1.2     (0.9, 1.7)
  Korean                                1.5     (1.0, 2.2)
  Vietnamese                            1.3     (0.9, 2.0)
  Am. Indian                            1.2     (0.7, 2.1)

Age, registry, urban area
  White                                 1.0
  Black                                 1.8     (1.7, 2.0)
  Hispanic                              1.5     (1.3, 1.6)
  Japanese                              0.9     (0.8, 1.0)
  Filipino                              1.5     (1.3, 1.8)
  Chinese                               1.2     (1.0, 1.4)
  Hawaiian                              1.4     (1.0, 2.0)
  Korean                                1.5     (1.0, 2.2)
  Vietnamese                            1.3     (0.9, 1.9)
  Am. Indian                            1.1     (0.6, 1.9)

Age, registry, urban area, SDFᵇ
  White                                 1.0
  Black                                 1.5     (1.3, 1.6)
  Hispanic                              1.2     (1.1, 1.4)
  Japanese                              0.8     (0.7, 1.0)
  Filipino                              1.4     (1.2, 1.6)
  Chinese                               1.2     (1.0, 1.4)
  Hawaiian                              1.3     (0.9, 1.8)
  Korean                                1.4     (1.0, 2.1)
  Vietnamese                            1.1     (0.7, 1.7)
  Am. Indian                            0.9     (0.5, 1.5)

Age, registry, urban area, SDFᵇ,
tumor grade, ER/PR status
  White                                 1.0
  Black                                 1.4     (1.3, 1.5)
  Hispanic                              1.2     (1.1, 1.3)
  Japanese                              0.9     (0.7, 1.0)
  Filipino                              1.4     (1.1, 1.6)
  Chinese                               1.1     (1.0, 1.4)
  Hawaiian                              1.2     (0.9, 1.7)
  Korean                                1.4     (1.0, 2.1)
  Vietnamese                            1.1     (0.7, 1.7)
  Am. Indian                            0.9     (0.5, 1.6)

Footnotes for Table III-8.

ᵃ OR = odds ratio is adjusted for other explanatory variables in the regression model.

ᵇ SDF = sociodemographic factors include % not married at time of diagnosis; % without high
school diploma; % working class; % families headed by women with no husband at home, with
one or more children, and who are living below the poverty level; median family income; % own
their home; % having no car.

TABLE III-9. Estimated percentage of patients diagnosed with distant stage disease by educationᵃ, hormone receptor statusᵇ and
tumor grade; adjusted for all other factorsᶜ in full regression model.
All other factors in full model as specified in Table III-5.


TABLE III-10. Estimated percentage of patients diagnosed with regional stage disease by educationᵃ, hormone receptor statusᵇ and
tumor grade; adjusted for all other factorsᶜ in full regression model.
All other factors in full model as specified in Table III-5.


TABLE III-11. Estimated percentage of patients diagnosed with tumors 1 cm or greater in diameter by educationᵃ, hormone receptor
All other factors in full model as specified in Table III-5.


TABLE III-12. Odds ratios for selected explanatory variables in full multiple logistic regression model comparing female breast
cancer patients with distant stage disease to those with localized stage disease.
TABLE III-13. Odds ratios for selected explanatory variables in full multiple logistic regression model comparing female breast
cancer patients with regional stage disease to those with localized stage disease.
TABLE III-14. Odds ratios for selected explanatory variables in full multiple logistic regression model comparing female breast
cancer patients with tumor diameter of 1 cm or greater to those with tumors <1 cm in diameter.
FIGURE II-1. Selected demographic characteristics of the SEER population compared with those for the total U.S. population.
FIGURE II-2. Selection of female breast cancer study group.

ALL FEMALE BREAST CANCER CASES, 1992-1996: N = 126,400
  Excluded: other ethnic groups (n = 2,509); remaining: 123,891
  Excluded: carcinoma in situ (n = 16,685)
Intended Study Population: 107,206
  Excluded: death certificate or autopsy cases (n = 599); remaining: 106,607

Analysis by Tumor Stage
  Excluded: missing stage (n = 3,236); remaining: 103,371
  Excluded: missing SES (n = 3,873)
  Tumor Stage Study Group: 99,498

Analysis by Tumor Size
  Excluded: missing size (n = 9,736); remaining: 96,871
  Excluded: missing SES (n = 3,612)
  Tumor Size Study Group: 93,259
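The counts in Figure II-2 form a simple chain of subtractions, which can be checked mechanically. The sketch below is a hypothetical consistency check; the helper name and exclusion labels are illustrative, and the numbers are those printed in the figure.

```python
def apply_exclusions(start, exclusions):
    """Subtract each exclusion in turn; return (label, remaining) pairs."""
    remaining = start
    trail = []
    for label, n in exclusions:
        remaining -= n
        trail.append((label, remaining))
    return trail


# Shared exclusions applied to all female breast cancer cases, 1992-1996.
base = apply_exclusions(126_400, [
    ("other ethnic groups", 2_509),
    ("carcinoma in situ", 16_685),
    ("death certificate or autopsy cases", 599),
])
eligible = base[-1][1]

# The two analyses branch from the same eligible population.
stage = apply_exclusions(eligible, [("missing stage", 3_236), ("missing SES", 3_873)])
size = apply_exclusions(eligible, [("missing size", 9_736), ("missing SES", 3_612)])

print(eligible, stage[-1][1], size[-1][1])
```

The 9,736 cases without a usable tumor size correspond to the 9,626 cases with size unavailable plus the 110 cases of Paget's disease without an underlying tumor in Tables II-8 and II-9.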
FIGURE III-1a. Ln-Odds plots of distant stage disease by explanatory variables.

[Panels: Age at Diagnosis; % without High School Diploma]

* Percentage of families headed by women with no husband at home, with one or more
children, and who are living below the poverty level.

FIGURE III-1b. Ln-Odds plots of distant stage disease by explanatory variables.

FIGURE III-1c. Ln-Odds plots of distant stage disease by explanatory variables.

FIGURE III-2a. Ln-Odds plots of regional stage disease by explanatory variables.

[Panel: Age at Diagnosis]

* Percentage of families headed by women with no husband at home, with one or more
children, and who are living below the poverty level.

FIGURE III-2b. Ln-Odds plots of regional stage disease by explanatory variables.

FIGURE III-2c. Ln-Odds plots of regional stage disease by explanatory variables.

FIGURE III-3a. Ln-Odds plots of tumor size greater than or equal to 1 cm by
explanatory variables.

[Panel: Age at Diagnosis]

* Percentage of families headed by women with no husband at home, with one or more
children, and who are living below the poverty level.

FIGURE III-3b. Ln-Odds plots of tumor size greater than or equal to 1 cm by
explanatory variables.

FIGURE III-3c. Ln-Odds plots of tumor size greater than or equal to 1 cm by
explanatory variables.
FIGURE III-4. Plot of observed and expected number of observations for Hosmer-Lemeshow
test of goodness-of-fit of model for distant stage disease vs. localized
stage disease.

Hosmer and Lemeshow GOF Statistic = 56.228 with 8 DF (p = 0.0001)

[Plot: thousands of observations; series: observed, expected]

FIGURE III-5. Plot of observed and expected number of observations for Hosmer-Lemeshow
test of goodness-of-fit of model for regional stage disease vs. localized
stage disease.

Hosmer and Lemeshow GOF Statistic = 49.709 with 8 DF (p = 0.0001)

[Plot: thousands of observations by group (decile); series: observed, expected]

FIGURE III-6. Plot of observed and expected number of observations for
Hosmer-Lemeshow test of goodness-of-fit of model for tumor size greater than or
equal to 1.0 cm vs. tumor size less than 1.0 cm.

Hosmer and Lemeshow GOF Statistic = 6.7037 with 8 DF (p = 0.5689)

[Plot: thousands of observations by group (decile); series: observed, expected]

FIGURE III-7. Plot of observed and expected number of observations for Hosmer-Lemeshow
test of goodness-of-fit of model for distant stage disease vs. localized
stage disease after excluding patients with unknown tumor grade.

Hosmer and Lemeshow GOF Statistic = 7.5475 with 8 DF (p = 0.4789)

[Plot: thousands of observations; series: observed, expected]
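The p-values attached to the Hosmer-Lemeshow statistics in Figures III-4 through III-7 are upper-tail chi-square probabilities with 8 degrees of freedom. For an even number of degrees of freedom that tail has a closed form, so it can be checked without a statistics library. The function below is an illustrative sketch with a hypothetical name, not the software used in the thesis.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability for even df.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


for label, stat in [("III-4", 56.228), ("III-5", 49.709),
                    ("III-6", 6.7037), ("III-7", 7.5475)]:
    print(label, chi2_sf_even_df(stat, 8))
```

The statistics for Figures III-6 and III-7 reproduce the printed p-values to four decimals. Those for Figures III-4 and III-5 come out far below 0.0001, so the printed "p = 0.0001" is probably the smallest value the software displayed, although that reading is an assumption.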
APPENDIX II-1a. Geocoding Update Instrument.


Geocoding Update Instrument                                   September 1997, Version 3

Reporting Site Name: _

Address:

Phone: _ Fax:


The most knowledgeable expert on geocoding at your registry should complete this form:

Name: _

Phone: _ Fax: _ E-mail:


1. What year did you begin geocoding?

Census Tract? □ _ Year

Block group? □ _ Year


2. Current Practices:

• What software is used? _

• How do you currently obtain geocoded information at the Census tract level?

□ In-house  □ Off-site, Skip to #4  □ Both, Complete #3 & #4

The following questions pertain to equipment and personnel costs incurred by your registry in obtaining geocoded information.


3. If In-house:

Software costs to the registry:

• Initial purchase price _

• Ongoing maintenance and support costs
(Specify annual, quarterly, etc.) _

Personnel costs to the registry:

FTE Equivalent | Job Description (skill level, degree required, etc.) | Annual Salary

Specify other personnel costs you incur in performing geocoding.

Other costs (e.g. supplies, equipment, transportation):

Category | Description | Annual Costs
APPENDIX II-1b. Geocoding Update Instrument.


Geocoding Update Instrument                                   September 1997, Version 3

4. If Off-site: (Specify private contractor, state department of health, etc.)

Facility Name: _

Individual Contact Name:

Phone: _ Fax: _ E-mail:

Charge to registry (specify annual, quarterly, etc.) $ _

List any "no cost" services provided to the registry:


5. Do you test whether Census tract codes generated are valid?

□ Yes  □ No

Describe method used to test for validity or verify accuracy and year implemented:

Year | Method

6. What additional effort do you take to obtain information on unmatched cases?

□ None  □ Specify _

7. Is a probability method used to assign uncertain matches?

□ None  □ Specify

8. Do you geocode at the block group level?

□ Yes  □ No

Specify

9. If you don't currently geocode at the block group level, how would you have to change your operations
in order to do so? Describe necessary procedures, costs, and time involved in modifying and
upgrading your system to achieve this.
APPENDIX II-1c. Geocoding Update Instrument.


Geocoding Update Instrument                                   September 1997, Version 3

10. Specify any structural problems that make geocoding difficult in your area.


11. Do you collect information on socio-economic status (SES, measured in terms of income, education or
employment/occupation) on your SEER cases? Please specify:


12. Please list concerns your registry has pertaining to geocoding that you wish to discuss at the SEER
meeting.


Please return this form NO LATER THAN SEPTEMBER 26 to:

Kathleen C. Barry, Applied Research Program

Executive Plaza North, Room 313

6130 Executive Boulevard, MSC 7344

Bethesda, MD 20892-7339

Phone:  301-496-5410 

Fax:  301-435-3710 

E-mail:  barrykiSdcpcepn. nd.nih.gov 
APPENDIX II-2a. Changes to SEER Data Coding Manual as a Result of Pilot Study.


CENSUS  TRACT/BLOCK  NUMBERING  AREA  (BNA) 

Section  III,  Field  01.B _ 


|  Census  Tract/Block  Numbering  Area  (BNA) 

|  The  census  tract/block  numbering  area  is  assigned  to  the  patient’s  residence  at  the  time  of 
|  diagnosis.  For  cases  diagnosed  1988  and  forward,  1990  definitions  of  census  tract  and  block 
|  numbering  area  must  be  used. 

|  If an area is assigned a census tract/block numbering area (BNA) code and the code is not
|  available, code as ‘999999.’

|  If  an  area  is  not  assigned  a  census  tract  or  BNA  (1980  or  prior  censuses  only),  code  as  ‘000000.’ 

Census  tract  numbers  should  be  right  justified  and  zero  filled  so  that  all  six  positions  have  a  code 
entered.  For  purposes  of  coding  census  tract,  assume  that  the  decimal  point  is  located  between 
the fourth and fifth positions of this field. Thus, census tract ‘409.6’ would be coded
‘040960’ and census tract ‘516.21’ would be coded ‘051621.’

|  BNA codes are 6 digits in length, as are census tract codes. They can be distinguished by their
|  range. Census tract codes range from 0001.00 to 9499.99, while BNAs range from 9501.00 to
|  9989.99. The decimal point is ignored. For example, BNA code 9607.23 would be coded
|  ‘960723.’
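The coding rules above (right-justify, zero-fill, implied decimal point between the fourth and fifth positions, and the tract versus BNA ranges) can be sketched as follows. The function names are hypothetical and this is an illustration of the stated rules, not SEER software.

```python
def encode_tract(number: str) -> str:
    """Encode a census tract or BNA number as a six-character code.

    Four positions hold the whole part (zero-filled on the left) and two
    hold the suffix after the implied decimal point (zero-filled on the right).
    """
    whole, _, frac = number.strip().partition(".")
    if not whole.isdigit() or len(whole) > 4:
        raise ValueError(f"invalid tract number: {number!r}")
    if frac and (not frac.isdigit() or len(frac) > 2):
        raise ValueError(f"invalid tract suffix: {number!r}")
    return whole.zfill(4) + frac.ljust(2, "0")


def classify_code(code: str) -> str:
    """Classify a six-digit code using the special values and ranges in the text."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected six digits: {code!r}")
    if code == "999999":
        return "assigned but code not available"
    if code == "000000":
        return "area not assigned a tract or BNA"
    value = int(code)
    if 100 <= value <= 949999:        # 0001.00 to 9499.99
        return "census tract"
    if 950100 <= value <= 998999:     # 9501.00 to 9989.99
        return "BNA"
    return "outside documented ranges"


print(encode_tract("409.6"), encode_tract("516.21"), encode_tract("9607.23"))
print(classify_code("960723"))
```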

A  census  tract  is  a  small  statistical  subdivision  of  a  county  with  (generally)  between  2,500  and 
8,000  residents.  The  boundaries  of  census  tracts  are  established  cooperatively  by  local 
committees  and  the  Census  Bureau.  An  attempt  is  made  to  keep  the  same  boundaries  from  census 
to  census  so  that  historical  comparability  will  be  maintained.  This  goal  is  not  always  achieved; 
old  tracts  may  be  subdivided  due  to  population  growth,  disappear  entirely,  or  have  their 
boundaries changed. Between 1970 and 1980 the number of tracts increased by over 20 percent.
Thus  it  is  important  to  know  which  definitions  were  used  for  the  coding  of  the  census  tracts:  the 
1970  definitions,  the  1980  definitions,  or  starting  with  1988  diagnoses,  the  1990  definitions. 

Some  parts  of  the  country  identify  areas  with  block  numbering  areas  (BNAs)  codes.  These  are 
the  geographic  equivalent  of  a  census  tract.  BNAs  were  implemented  in  the  1990  census.  The 
BNA  is  always  a  subunit  of  a  county  and  census  tracts/BNAs  are  mutually  exclusive;  that  is,  a 
given  county  is  subdivided  into  either  census  tracts  or  BNAs,  but  not  both.  There  may  be  as  few 
as  one  or  two  BNAs  per  county,  or  more  than  20  BNAs  per  county. 

Note  that  Block  Group  coding  is  different  than  block  numbering  area  coding  and  is  not  currently 
collected  by  SEER. 
APPENDIX II-2b. Changes to SEER Data Coding Manual as a Result of Pilot Study.


CODING SYSTEM FOR CENSUS TRACT

Section III, Field 01.C _


Coding System for Census Tract

Code

0  Not tracted

1  1970 Census Tract Definitions (1973-77)

2  1980 Census Tract Definitions (1978-87)

3  1990 Census Tract Definitions (1988+)

4  2000 Census Tract Definitions


Note: Do not implement code ‘4’ until instructed by SEER.


APPENDIX II-2c. Changes to SEER Data Coding Manual as a Result of Pilot Study.


CENSUS TRACT CERTAINTY

Section III, Field 01.D

Census Tract Certainty

Code

1  Census tract/BNA based on complete and valid street address of residence
2  Census tract/BNA based on residence ZIP+4
3  Census tract/BNA based on residence ZIP+2
4  Census tract/BNA based on residence ZIP only
5  Census tract/BNA based on ZIP of post office box
9  Unable to assign census tract or block numbering area based on available information/unknown

This field is a code indicating the basis of assignment of census tract or block numbering area
(BNA) for an individual record. It is helpful in identifying cases census tracted/BNA’d from
incomplete information or a post office box address. Most of the time, this information is
provided by a geocoding vendor service. Alternatively, the code is manually assigned by central
registry staff. Codes are hierarchical, with lower numbers having priority.

Use code ‘1’ when census tract or block numbering area is assigned with certainty. This can
result either from a computer match using geocoding software or from manual searches.

Example 1  Complete and valid street address used.

Example 2  Rural route or incomplete street address is used, but is known to lie entirely
           within one census tract.

Use codes ‘2’ through ‘5’ when census tract or block numbering area is assigned with some
uncertainty.

Example 3  Street address is incomplete or invalid, or only rural route number is available,
           but ZIP code of residence is known. The case may be geocoded manually or
           geocoded using software. The case is placed at the geographic center of the
           ZIP code area, i.e., the ZIP code “centroid.” Use code ‘4.’

Example 4  Post office box number and ZIP code used. Use code ‘5.’

Note: Avoid using P.O. box mailing address, when possible, as this is not the true
residence of the patient.

Use code ‘9’ when the ZIP code is missing, when the complete address of the patient cannot be
determined or when there is insufficient information to assign census tract or BNA.

Use of this code is required effective with cases diagnosed 1/1/98 and after. It is strongly
suggested that this information be obtained from the geocode vendor for cases back to 1988.
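
The hierarchy described above can be sketched as a short decision function. This is only an illustration: the function name and its four parameters are hypothetical, and in practice the code is usually supplied by a geocoding vendor or assigned manually by registry staff. The order of the checks reflects the rule that lower codes have priority.

```python
def census_tract_certainty(street_address_valid: bool = False,
                           within_single_tract: bool = False,
                           residence_zip_digits: int = 0,
                           po_box_zip_known: bool = False) -> int:
    """Return a census tract certainty code (hypothetical helper).

    residence_zip_digits: 9 for ZIP+4, 7 for ZIP+2, 5 for ZIP only,
    0 when the residence ZIP code is unknown (an assumed encoding).
    """
    # Code 1: tract or BNA assigned with certainty (Examples 1 and 2).
    if street_address_valid or within_single_tract:
        return 1
    # Codes 2-4: residence ZIP code at decreasing precision.
    if residence_zip_digits == 9:
        return 2
    if residence_zip_digits == 7:
        return 3
    if residence_zip_digits == 5:
        return 4  # ZIP code centroid (Example 3)
    # Code 5: only the ZIP code of a post office box is known (Example 4).
    if po_box_zip_known:
        return 5
    # Code 9: insufficient information to assign a tract or BNA.
    return 9


print(census_tract_certainty(residence_zip_digits=5))
```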


APPENDIX  III-1.  Pearson  Correlation  Coefficients;  Prob  >  |R|  under  Ho:  Rho=0;  N  =  102,419.
RTEDUNOH = % without high school diploma; RTUNEMPL = % unemployed; RTWRKCLA = % working class; RTPOVBEL = % below poverty level;
RTPOVFAM = % families headed by women with no husband at home, with one or more children, and who are living below the poverty level;
RTMEDFIN = median family income; RTOWNERS = % own their home; RTN0NE0R = % having no car; RTFORGBO = % foreign-born.
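
For readers unfamiliar with the "Prob > |R| under Ho: Rho=0" notation, the following minimal sketch shows one way a Pearson correlation coefficient and its two-sided p-value could be computed. The function names are hypothetical, and the p-value uses a normal approximation to the t distribution, an assumption that is reasonable only for large samples such as the N = 102,419 reported here.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two samples of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def p_value_rho_zero(r: float, n: int) -> float:
    """Two-sided p-value for H0: rho = 0 (large-sample normal approximation)."""
    if abs(r) >= 1.0:
        return 0.0
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


print(p_value_rho_zero(0.01, 102419))
```

With a sample this large, even very weak correlations yield small p-values, so the size of each coefficient is more informative than its statistical significance.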


APPENDIX  III-2.  Odds  ratios  for  explanatory  variables  in  full  model  comparing
female  breast  cancer  patients  with  distant  stage  disease  to  those  with  localized  disease. 


                                                          Distant vs. Localized
Variables in model                                        OR(a)  (95% CI)     p(b)

White                                                     1.0
Black                                                     1.3    (1.2, 1.5)   .0001
Hispanic                                                  1.1    (1.0, 1.2)   .2452
Japanese                                                  0.7    (0.6, 1.0)   .0316
Filipino                                                  0.9    (0.7, 1.1)   .2610
Chinese                                                   1.0    (0.8, 1.3)   .7159
Hawaiian                                                  1.4    (0.9, 2.1)   .1741
Korean                                                    0.6    (0.3, 1.0)   .0951
Vietnamese                                                0.7    (0.3, 1.2)   .2032
American Indian                                           1.4    (0.7, 2.6)   .3343
Age (10 yr increments)                                    1.0    (1.0, 1.0)   .6218
Registry 0                                                1.0
Registry 1                                                0.9    (0.9, 1.1)   .7405
Registry 2                                                1.1    (1.0, 1.3)   .0183
Registry 3                                                1.1    (0.9, 1.2)   .4187
Registry 4                                                1.0    (0.8, 1.3)   .8857
Registry 5                                                1.1    (1.0, 1.2)   .2799
Registry 6                                                1.2    (1.0, 1.4)   .0386
Registry 7                                                1.0    (0.9, 1.1)   .5184
Registry 8                                                0.9    (0.7, 1.1)   .1864
Registry 9                                                1.0    (0.9, 1.2)   .6217
Registry 10                                               1.0    (0.8, 1.1)   .5121
Urban                                                     1.0    (0.9, 1.1)   .9467
Not married                                               1.4    (1.3, 1.5)   .0001
No high school diploma (10% increments)                   1.1    (1.0, 1.1)   .0008
Working class job (10% increments)                        1.0    (1.0, 1.1)   .3355
Median family income ($20 thousands)                      0.9    (0.9, 1.0)   .0107
Families <poverty, female head of house (10% increments)  1.0    (0.9, 1.1)   .8715
Home ownership (10% increments)                           1.0    (1.0, 1.0)   .2417
No car (10% increments)                                   1.0    (1.0, 1.0)   .8409
Negative for hormone receptors                            2.1    (2.0, 2.2)   .0001
Advanced grade tumor (grades 3 or 4)                      2.0    (1.8, 2.1)

(a) OR = odds ratio adjusted for all other explanatory variables in the model.
(b) p-value for the Wald chi-square statistic with 1 degree of freedom.
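
As a reminder of how the quantities in these tables relate to a fitted logistic regression, the sketch below derives an odds ratio, its 95% confidence interval, and the Wald chi-square p-value from a single coefficient and its standard error. The function name and the example inputs are hypothetical; the underlying coefficients and standard errors are not reported in this appendix.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def wald_summary(beta: float, se: float) -> Tuple[float, float, float, float]:
    """Return (odds ratio, CI lower, CI upper, p-value) for one coefficient."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - Z_95 * se)
    upper = math.exp(beta + Z_95 * se)
    chi_square = (beta / se) ** 2  # Wald chi-square with 1 degree of freedom
    # For 1 df, P(chi-square > c) equals erfc(sqrt(c / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return odds_ratio, lower, upper, p_value


# Hypothetical coefficient and standard error, for illustration only.
print(wald_summary(0.30, 0.06))
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio itself, which matches the pattern seen in the wider intervals above.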
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APPENDIX  III-3.  Odds  ratios  for  explanatory  variables  in  full  model  comparing 
female  breast  cancer  patients  with  regional  stage  disease  to  those  with  localized  disease. 


                                                          Regional vs. Localized
Variables in model                                        OR(a)  (95% CI)     p(b)

White                                                     1.0
Black                                                     1.2    (1.1, 1.2)   .0001
Hispanic                                                  1.1    (1.1, 1.2)   .0003
Japanese                                                  0.8    (0.7, 0.9)   .0008
Filipino                                                  1.0    (0.9, 1.1)   .9928
Chinese                                                   1.0    (0.9, 1.1)   .9968
Hawaiian                                                  1.0    (0.8, 1.3)   .8213
Korean                                                    0.9    (0.7, 1.1)   .3344
Vietnamese                                                1.1    (0.9, 1.5)   .3967
American Indian                                           1.3    (0.9, 1.9)   .2256
Age (10 yr increments)                                    0.9    (0.9, 0.9)   .0001
Registry 0                                                1.0
Registry 1                                                1.0    (0.9, 1.0)   .0726
Registry 2                                                0.9    (0.8, 0.9)   .0002
Registry 3                                                1.0    (1.0, 1.1)   .3796
Registry 4                                                0.8    (0.7, 0.9)   .0001
Registry 5                                                0.8    (0.8, 0.9)   .0001
Registry 6                                                1.0    (0.9, 1.1)   .9880
Registry 7                                                0.8    (0.8, 0.9)   .0001
Registry 8                                                1.1    (1.0, 1.2)   .1491
Registry 9                                                1.0    (0.9, 1.1)   .8744
Registry 10                                               1.0    (0.9, 1.0)   .1922
Urban                                                     1.0    (0.9, 1.0)   .0304
Not married                                               1.1    (1.0, 1.1)   .0001
No high school diploma (10% increments)                   1.0    (1.0, 1.1)   .0008
Working class job (10% increments)                        1.0    (1.0, 1.0)   .2931
Median family income ($20 thousands)                      1.0    (1.0, 1.0)   .9581
Families <poverty, female head of house (10% increments)  1.0    (1.0, 1.1)   .1763
Home ownership (10% increments)                           1.0    (1.0, 1.0)   .4512
No car (10% increments)                                   1.0    (1.0, 1.0)   .1686
Negative for hormone receptors                            0.8    (0.8, 0.8)   .0001
Advanced grade tumor (grades 3 or 4)                      2.1                 .0001

(a) OR = odds ratio adjusted for all other explanatory variables in the model.
(b) p-value for the Wald chi-square statistic with 1 degree of freedom.


APPENDIX  III-4.  Odds  ratios  for  explanatory  variables  in  full  model  comparing 
female  breast  cancer  patients  with  tumors  greater  than  or  equal  to  1  cm  in  diameter  to 
those  with  tumors  less  than  1  cm  in  diameter. 

                                                          1 cm + vs. <1 cm
Variables in model                                        OR(a)  (95% CI)     p(b)

White                                                     1.0
Black                                                     1.4    (1.3, 1.5)   .0001
Hispanic                                                  1.2    (1.1, 1.3)   .0001
Japanese                                                  0.9    (0.7, 1.0)   .0660
Filipino                                                  1.4    (1.1, 1.6)   .0009
Chinese                                                   1.1    (1.0, 1.4)   .1469
Hawaiian                                                  1.2    (0.9, 1.7)   .3202
Korean                                                    1.4    (1.0, 2.1)   .0956
Vietnamese                                                1.1    (0.7, 1.7)   .7326
American Indian                                           0.9    (0.5, 1.6)   .6709
Age (10 yr increments)                                    0.9    (0.9, 1.0)   .0001
Registry 0                                                1.0
Registry 1                                                0.9    (0.9, 0.9)   .0166
Registry 2                                                0.9    (0.8, 1.0)   .0036
Registry 3                                                0.8    (0.7, 0.9)   .0001
Registry 4                                                0.7    (0.6, 0.9)   .0001
Registry 5                                                0.7    (0.6, 0.8)   .0001
Registry 6                                                0.9    (0.8, 1.0)   .0506
Registry 7                                                0.7    (0.6, 0.8)   .0001
Registry 8                                                0.9    (0.8, 1.0)   .4035
Registry 9                                                1.0    (0.9, 1.1)   .6223
Registry 10                                               1.0    (0.9, 1.1)   .9709
Urban                                                     0.9    (0.9, 1.0)   .0003
Not married                                               1.2    (1.2, 1.3)   .0001
No high school diploma (10% increments)                   1.1    (1.0, 1.1)   .0030
Working class job (10% increments)                        1.0    (1.0, 1.1)   .0579
Median family income ($20 thousands)                     <1.0    (0.9, 1.0)   .0250
Families <poverty, female head of house (10% increments)  1.0    (1.0, 1.1)   .2954
Home ownership (10% increments)                           1.0    (1.0, 1.0)   .2285
No car (10% increments)                                   1.0    (0.9, 1.0)   .3461
Negative for hormone receptors                            0.7    (0.7, 0.8)   .0001
Advanced grade tumor (grades 3 or 4)                      3.3    (3.1, 3.4)

(a) OR = odds ratio adjusted for all other explanatory variables in the model.
(b) p-value for the Wald chi-square statistic with 1 degree of freedom.
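
Several variables in these tables are scaled in increments (10 years of age, 10 percentage points, $20 thousand of income), so each odds ratio describes the effect of one increment. Because the logistic model is linear on the log-odds scale, the odds ratio for a difference of k increments is the per-increment odds ratio raised to the power k. The sketch below illustrates this; the function name is hypothetical, and because the published odds ratios are rounded to one decimal place the result is only a rough guide.

```python
import math


def odds_ratio_for_change(or_per_increment: float, n_increments: float) -> float:
    """Odds ratio implied by a change of n_increments in a scaled covariate."""
    if or_per_increment <= 0:
        raise ValueError("odds ratio must be positive")
    # Effects add on the log-odds scale, so they multiply on the odds scale.
    return math.exp(n_increments * math.log(or_per_increment))


# A per-increment odds ratio of 1.1 applied to a 30-point difference,
# i.e. three 10% increments.
print(round(odds_ratio_for_change(1.1, 3), 3))
```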


